FILE 'REGISTRY' ENTERED AT 09:38:13 ON 26 APR 90 

USE IS SUBJECT TO THE TERMS OF YOUR CUSTOMER AGREEMENT 

COPYRIGHT (C) 1990 AMERICAN CHEMICAL SOCIETY 

STRUCTURE FILE UPDATES: HIGHEST RN 126720-44-3 

DICTIONARY FILE UPDATES: 22 APR 90 (900422/ED) HIGHEST RN 126693-39-8 
=> d que l2

L2              0 SEA SERYL(2W)PHENYL(2W)ALANYL(2W)CYSTEINYL(2W)ARGINYL(2W)
                  PROLYL(2W)ISO(W)LEUCYL

=> d que l3

L3              0 SEA ISO(W)LEUCYL(2W)GLUTAMYL(2W)THREONYL(2W)LEUCYL(2W)
                  VALYL(2W)ASPARTYL(2W)ISO(W)LEUCYL

=> d que l4

L4              0 SEA ALANYL(2W)PROLYL(2W)METHIONYL(2W)ALANYL(2W)GLUTAMYL(2W)
                  GLYCYL(2W)GLYCYL

                0 SEA HISTID?(2W)GLUTAMYL(2W)VALYL(2W)VALYL(2W)LYSYL(2W)
                  PHENYL(W)ALANYL(2W)METHIONYL
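
The SEA queries above spell a peptide residue by residue, with (2W) between adjacent residue names and (W) inside the two-word names ISO(W)LEUCYL and PHENYL(W)ALANYL (the L2 display shows PHENYL(2W)ALANYL; the sketch below follows the (W) form of the last query). A minimal sketch of how such a term string might be assembled from one-letter residue codes is given here. The names `RESIDUE_TERMS` and `build_sea_query` are hypothetical, introduced only for illustration; the table covers only residue names that occur in these queries, and the session itself does not show how the queries were composed.

```python
# Hypothetical helper: assemble a proximity-search term string in the style
# of the SEA queries shown in this session. Assumption: residues are joined
# with (2W) and the parts of two-word residue names with (W).
RESIDUE_TERMS = {
    "S": ["SERYL"],
    "F": ["PHENYL", "ALANYL"],
    "C": ["CYSTEINYL"],
    "R": ["ARGINYL"],
    "P": ["PROLYL"],
    "I": ["ISO", "LEUCYL"],
    "E": ["GLUTAMYL"],
    "T": ["THREONYL"],
    "L": ["LEUCYL"],
    "V": ["VALYL"],
    "D": ["ASPARTYL"],
    "A": ["ALANYL"],
    "M": ["METHIONYL"],
    "G": ["GLYCYL"],
    "K": ["LYSYL"],
}


def build_sea_query(peptide: str) -> str:
    """Return a SEA term string for a peptide given in one-letter codes."""
    parts = []
    for residue in peptide.upper():
        if residue not in RESIDUE_TERMS:
            raise KeyError(f"no term defined for residue {residue!r}")
        # Two-word names such as ISO LEUCYL are joined with (W).
        parts.append("(W)".join(RESIDUE_TERMS[residue]))
    # Adjacent residues are joined with (2W).
    return "SEA " + "(2W)".join(parts)


if __name__ == "__main__":
    print(build_sea_query("SFCRPI"))
    print(build_sea_query("APMAEGG"))
```

Called with `"APMAEGG"`, the helper reproduces the term string displayed for L4.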



SEQUENCE 



Initial Score = 6 Optimized Score = S Significance = 4.36 

Residue Identity = 35% Matches = 10 Mismatches = 10 

Gaps = 8 Conservative Substitutions = 0

X 10 X
SFCR-PIEYLVDIFQEYPXXX
|| || ||| |||
TKRDVNNFDODFTREEPILTLVDEAIVKQINGEEFKGFSYFGEDLMP
700 710 720 730



10. GUEST-346-1
    A22566        3-Phosphoshikimate 1-carboxyvinyltransferase



ENTRY           A22566 #Type Protein
TITLE           3-Phosphoshikimate 1-carboxyvinyltransferase -
                  Salmonella typhimurium #EC-number 2.5.1.19
ALTERNATE-NAME  5-enolpyruvylshikimate-3-phosphate synthase
SOURCE          Salmonella typhimurium
ACCESSION       A22566
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Stalker D. M., Hiatt W. R., Comai L.
   #Journal     J. Biol. Chem. (1985) 260:4724-4728
   #Title       A single amino acid substitution in the enzyme
                  5-enolpyruvylshikimate-3-phosphate synthase
                  confers resistance to the herbicide glyphosate.
   #Comment     The authors translated the codon CCT for residue 35
                  as Ala.
GENETIC
   #Name        aroA
SUMMARY         #Molecular-weight 46157 #Length 427 #Checksum 4952
SEQUENCE



Initial Score    =  7   Optimized Score =  9   Significance = 4.96
Residue Identity = 31%  Matches         = 10   Mismatches   = 10
Gaps             = 12   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVD-----------IFQE-YPXXX
NEIVLTGEPRMKERPIGHLVDSLRQGGANIDYLEQENYPPLRLRGGFTGGDI
120 130 140 150 X 160



                  developmental regulation.
COMMENT         THIS SEQUENCE HAS NOT BEEN COMPARED TO THE
                  NUCLEOTIDE TRANSLATION.
SUMMARY         #Molecular-weight 86431 #Length 747 #Checksum 3147
SEQUENCE



Initial Score    =  6   Optimized Score =  9   Significance = 4.96
Residue Identity = 31%  Matches         = 10   Mismatches   = 10
Gaps             = 12   Conservative Substitutions = 0

          X 10 20
          SFCR----PIEY--------LVDIFQEYPXXX
LCIREKYMQKSFQRFPKTPSKYLRNIDGEALVAIESFYPVFTPPPKKGEDPF
150 X 160 170 180 X 190



 8. GUEST-346-1
    A25687        H-2 class II histocompatibility antigen, E-a/k



ENTRY           A25687 #Type Protein (fragment)
TITLE           H-2 class II histocompatibility antigen, E-a/k
                  beta-2 chain precursor - Mouse (fragment)
SOURCE          Mus musculus #Common-name house mouse
ACCESSION       A25687
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Braunstein N. S., Germain R. N.
   #Journal     EMBO J. (1986) 5:2469-2476
   #Title       The mouse E-beta-2 gene: a class II MHC-beta-gene
                  with limited intraspecies polymorphism and an
                  unusual pattern of transcription.
GENETIC
   #Introns     34/1, 123/1, 217/1
COMMENT         THIS SEQUENCE HAS NOT BEEN COMPARED TO THE
                  NUCLEOTIDE TRANSLATION.
SUMMARY         #Length 253 #Checksum 8540
SEQUENCE



Initial Score    =  8   Optimized Score =  9   Significance = 4.96
Residue Identity = 30%  Matches         =  9   Mismatches   = 11
Gaps             = 10   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVDIF----------QEYPXXX
DMLDNYRASVDRCRNNYDLVDIFMLNLKAEPKVTVYPAKTQPLEHHNLLV
100 X 110 120 130 X 140



GUEST-346-1
    B28163        Protein kinase C, epsilon type - Rat #EC-number



ENTRY           B28163 #Type Protein
TITLE           Protein kinase C, epsilon type - Rat #EC-number
                  2.7.1.-
SOURCE          Rattus norvegicus #Common-name Norway rat
ACCESSION       B28163
REFERENCE       (Sequence translated from the mRNA sequence)
   #Authors     Ono Y., Fujii T., Ogita K., Kikkawa U., Igarashi K.,
                  Nishizuka Y.
   #Journal     J. Biol. Chem. (1988) 263:6927-6932
   #Title       The structure, expression, and properties of
                  additional members of the protein kinase C family.
SUPERFAMILY     #Name protein kinase C
KEYWORDS        kinase\ phorbol ester receptor\ calcium binding\
                  ATP-binding phosphotransferase
SEQUENCE



Initial Score    =  6   Optimized Score =  9   Significance = 4.96
Residue Identity = 28%  Matches         = 10   Mismatches   = 10
Gaps             = 15   Conservative Substitutions = 0

          X 10 20
          SFCRPIEY------------LVDIF---QEYPXXX
FNSSYRRGDPEFEAMLEYSQGIVDTVAKESLVDIFPWLQIFPNRDLALLKRCLKV
190 200 210 220 230 X 240



GUEST-346-1
    GNFFG2        Retrovirus-related pol polyprotein (transposon



ENTRY           GNFFG2 #Type Protein (fragment)
TITLE           Retrovirus-related pol polyprotein (transposon
                  gypsy) (version 2) - Fruit fly
DATE            31-Dec-1988 #Sequence 31-Dec-1988 #Text 30-Jun-1989
PLACEMENT       1451.0 15.0 1.0 1.0 2.0
SOURCE          Drosophila melanogaster
ACCESSION       A23769
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Yuki S., Ishimaru S., Inouye S., Saigo K.
   #Journal     Nucleic Acids Res. (1986) 14:3017-3030
   #Title       Identification of genes for reverse
                  transcriptase-like enzymes in two Drosophila
                  retrotransposons, 412 and gypsy; a rapid detection
                  method of reverse transcriptase genes using YXDD
                  box probes.
COMMENT         The DNA sequence was obtained from GenBank, release
SUPERFAMILY     #Name pol polyprotein
KEYWORDS        reverse transcriptase\ polyprotein
SUMMARY         #Length 930 #Checksum 7522
SEQUENCE



Initial Score    =  5   Optimized Score =  9   Significance = 4.96
Residue Identity = 25%  Matches         =  9   Mismatches   = 11
Gaps             = 15   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYL---------------VDIFQEYPXXX
NMRVSQEKTRFFKESVEYLGFIVSKDGTKSDPEKVKAIQEYPEPDCVYKVRSFLG
350 360 370 380 390 X 400



GUEST-346-1
    A27366        AMP deaminase, skeletal muscle - Rat #EC-number

ENTRY           A27366 #Type Protein
TITLE           AMP deaminase, skeletal muscle - Rat #EC-number
                  3.5.4.6
ALTERNATE-NAME  adenylic acid deaminase\ AMP aminase\ myoadenylate
                  deaminase
SOURCE          Rattus norvegicus #Common-name Norway rat
ACCESSION       A27366
REFERENCE       (Sequence translated from the mRNA sequence)
   #Authors     Sabina R. L., Marquetant R., Desai N. M., Kaletha K.,
                  Holmes E. W.
   #Journal     J. Biol. Chem. (1987) 262:12397-12400
   #Title       Cloning and sequence of rat myoadenylate deaminase
                  cDNA. Evidence for tissue-specific and



Hemocyanin d chain - Tarantula (Eurypelma



ENTRY           BHTLD #Type Protein
TITLE           Hemocyanin d chain - Tarantula (Eurypelma
                  californica)
DATE            15-Nov-1984 #Sequence 15-Nov-1984 #Text 30-Sep-1988
PLACEMENT       646.0 1.0 1.0 1.0 1.0
SOURCE          Eurypelma californica
ACCESSION       A02565
REFERENCE       (Complete sequence)
   #Authors     Schartau W., Eyerle R., Reisinger P., Geisert H.,
                  Storz H., Linzen B.
   #Journal     Hoppe-Seyler's Z. Physiol. Chem. (1983)
                  364:1383-1409
COMMENT         Asn-445 probably binds carbohydrate.
COMMENT         Residues 169-177 and 319-327 are thought to form the
                  copper binding site. The two copper ions bound
                  each have 3 nitrogen ligands (presumably
                  contributed by histidine residues) and share a
                  bridging ligand (possibly contributed by a
                  tyrosine residue) in addition to binding oxygen.
COMMENT         The hemocyanins are copper-containing, oxygen
                  transport proteins that are highly conserved but
                  found only in arthropods and molluscs. These
                  proteins have a complex and variable quaternary
                  structure with homologous chains aggregating to
                  form either simple hexamers or multihexamer
                  configurations. The tarantula hemocyanin is a
                  24-chain polymer with seven different chains
                  identified.
SUPERFAMILY     #Name hemocyanin
KEYWORDS        respiratory protein\ oxygen transport\ copper
SUMMARY         #Molecular-weight 72178 #Length 627 #Checksum 9707
SEQUENCE



Initial Score    =  6   Optimized Score =  9   Significance = 4.96
Residue Identity = 36%  Matches         =  9   Mismatches   = 11
Gaps             =  5   Conservative Substitutions = 0

          X 10 20
          SFCRPIEY-----LVDIFQEYPXXX
NPGVMDDTSTSLRDPIFYRYHRWMDNIFGEYKHRLPSYTHQQLDF
340 350 360 370 X 380



O4CHC7        Cytochrome P450XVIIA1, steroid 17alpha-monooxygena

ENTRY           O4CHC7 #Type Protein
TITLE           Cytochrome P450XVIIA1, steroid 17alpha-monooxygenase
                  - Chicken #EC-number 1.14.99.9
ALTERNATE-NAME  cytochrome P450(c17), steroid 17alpha-hydroxylase
DATE            30-Jun-1989 #Sequence 30-Jun-1989 #Text 30-Jun-1989
PLACEMENT       14.0 6.0 1.0 1.0 1.0
SOURCE          Gallus gallus #Common-name chicken
ACCESSION       JT0318
REFERENCE       (Sequence translated from the mRNA sequence)
   #Authors     Ono H., Iwasaki M., Sakamoto N., Mizuno S.
   #Journal     Gene (1988) 66:77-85
   #Title       cDNA cloning and sequence analysis of a chicken gene
                  expressed during the gonadal development and
                  homologous to mammalian cytochrome P-450c17.
   #Residues    1-508 <ONO>
SUPERFAMILY     #Name cytochrome P450
KEYWORDS        steroidogenesis\ ovary\ testis
SEQUENCE



wioiecuiai — weignt -ioood igT,n oz>j> ttunscKsum Mf*; 



Initial Score    =  9   Optimized Score =  9   Significance = 4.96
Residue Identity = 40%  Matches         =  8   Mismatches   = 12
Gaps             =  0   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVDIFQEYPXXX
          ||  ||   |  |  ||
SVQLRPYNAISFSGPIAVFVSVFLIYPLGQSDWFFPPDFG
140 X 150 160 X 170



GUEST-346-1
    RHRTG         Gonadoliberin precursor - Rat

ENTRY           RHRTG #Type Protein
TITLE           Gonadoliberin precursor - Rat
ALTERNATE-NAME  gonadotropin releasing hormone\ GnRH\ luteinizing
                  hormone releasing hormone\ LHRH
INCLUDES        gonadotropin releasing hormone\ prolactin
                  release-inhibiting factor
DATE            31-Mar-1988 #Sequence 31-Mar-1988 #Text 31-Mar-1988
PLACEMENT       527.0 1.0 2.0 1.0 1.0
SOURCE          Rattus norvegicus #Common-name Norway rat
ACCESSION       B26173
REFERENCE       (Sequence translated from the mRNA sequence)
   #Authors     Adelman J. P., Mason A. J., Hayflick J. S., Seeburg
                  P. H.
   #Journal     Proc. Nat. Acad. Sci. USA (1986) 83:179-183
   #Title       Isolation of the gene and hypothalamic cDNA for the
                  common precursor of gonadotropin-releasing hormone
                  and prolactin release-inhibiting factor in human
                  and rat.
COMMENT         This hormone stimulates the secretion of both
                  luteinizing and follicle-stimulating hormones.
SUPERFAMILY     #Name gonadoliberin
KEYWORDS        reproduction\ prolactin\ amidation\ peptide hormone\
                  hypothalamus
FEATURE
   1-23         #Domain signal sequence <SIG>\
   24-92        #Protein progonadoliberin <PGN>\
   24-33        #Peptide gonadoliberin <GLN>\
   24           #Modified-site pyrrolidone carboxylic
                  acid, in gonadoliberin (by homology)\
   33           #Modified-site amidated carboxyl end of
                  active gonadoliberin (from Gly-34) (by
                  homology)\
   37-92        #Peptide prolactin release-inhibiting
                  factor <PIF>
SUMMARY         #Molecular-weight 10500 #Length 92 #Checksum 1405
SEQUENCE



Initial Score    =  7   Optimized Score =  9   Significance = 4.96
Residue Identity = 40%  Matches         =  8   Mismatches   = 12
Gaps             =  0   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVDIFQEYPXXX
SQHWSYGLRPGGKRNTEHLVDSFQEMGKEEDQMAEPQNFE
30 40 50 X 60



4. GUEST-346- 1 



                Gonadoliberin precursor - Rat
 4. BHTLD       Hemocyanin d chain - Tarantula     627     6     9   4.96     0
 5. O4CHC7      Cytochrome P450XVIIA1, steroid     508     6     9   4.96     0
 6. GNFFG2      Retrovirus-related pol polypro     930     5     9   4.96     0
 7. A27366      AMP deaminase, skeletal muscle     747     6     9   4.96     0
 8. A25687      H-2 class II histocompatibilit     253     8     9   4.96     0
 9. B28163      Protein kinase C, epsilon type     737     6     9   4.96     0
10. A22566      3-Phosphoshikimate 1-carboxyvi     427     7     9   4.96     0

          **** 3 standard deviations above mean ****

11. A24363      Brown fat mitochondrial uncoup     306     8     8   3.72     0
12. B25687      H-2 class II histocompatibilit     217     8     8   3.72     0
13. A26294      Uncoupling protein - Rat           307     8     8   3.72     0
14. HMIVN1      Hemagglutinin precursor - Infl     566     8     8   3.72     0
15. F2NTD2      Photosystem II D2 protein - Co     353     7     8   3.72     0
16. F2SPD2      Photosystem II D2 protein - Sp     353     7     8   3.72     0
17. F2PMD2      Photosystem II D2 protein - Ga     353     7     8   3.72     0
18. UBBYB       Tubulin beta chain - Yeast (Sa     457     7     8   3.72     0
19. A27635      Ig heavy chain precursor V reg     122     8     8   3.72     0
20. A29278      Uncoupling protein - Rat           307     8     8   3.72     0



GUEST-346-1
    A25104        Band 3 protein, nonerythroid (MEB3) - Human

ENTRY           A25104 #Type Protein (fragment)
TITLE           Band 3 protein, nonerythroid (MEB3) - Human
                  (fragment)
SOURCE          Homo sapiens #Common-name man
ACCESSION       A25104
REFERENCE       (Sequence translated from the mRNA sequence)
   #Authors     Demuth D. R., Showe L. C., Ballantine M., Palumbo A.,
                  Fraser P. J., Cioe L., Rovera G., Curtis P. J.
   #Journal     EMBO J. (1986) 5:1205-1214
   #Title       Cloning and structural characterization of a human
                  non-erythroid band 3-like protein.
SUMMARY         #Length 865 #Checksum 7746
SEQUENCE



Initial Score    =  8   Optimized Score = 10   Significance = 6.20
Residue Identity = 29%  Matches         = 10   Mismatches   = 10
Gaps             = 14   Conservative Substitutions = 0

          X 10 X
          SFCRPI--------------EYLVDIFQEYPXXX
EGSFLVRFVSRFTREIFAFLISLIFIYETFYKLVKIFQEHPLHGCSASNSSEVD
440 X 450 460 470 480



 2. GUEST-346-1
    S00929        Photosystem II D2 protein - Barley chloroplast

ENTRY           S00929 #Type Protein
TITLE           Photosystem II D2 protein - Barley chloroplast
SOURCE          chloroplast Hordeum vulgare #Common-name barley
ACCESSION       S00929
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Efimov V. A., Andreeva A. V., Reverdatto S. V.,
                  Chakhmakhcheva O. G.
   #Journal     Nucleic Acids Res. (1988) 16:5686
   #Title       Nucleotide sequence of the barley chloroplast psbD
                  gene for the D2 protein of photosystem II.
   #Comment     The authors translated the codons GAT for residue
                  167 as Gly, CCA for residue 171 as Ala, GAT for
                  residue 173 as Ser, and AAA for residue 318 as
                  Leu.



SEARCH STATISTICS 



Scores:        Mean        Median      Standard Deviation
               2           3           1.42

Times:         CPU                     Total Elapsed
               00:01:12.37             00:01:41.00



Number of residues: 3406022 
Number of sequences searched: 12476
Number of scores above cutoff: 4735 

Cut-off raised to 2. 
Cut-off raised to 3. 
Cut-off raised to 4. 

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 
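
The printout does not state how significance is derived from a score. The values printed in the list that follows for initial scores of 7 and 8 (3.52 and 4.23) are consistent with a plain z-score, that is, the score minus the mean, divided by the standard deviation, using the Mean 2 and Standard Deviation 1.42 reported under SEARCH STATISTICS. The sketch below assumes that formula; the function name `significance` is hypothetical, and the mean is taken as the rounded value printed in the report.

```python
def significance(score: float, mean: float, sd: float) -> float:
    """Assumed z-score: standard deviations by which a score exceeds the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return round((score - mean) / sd, 2)


# Values taken from the SEARCH STATISTICS block of this report.
MEAN = 2.0
SD = 1.42

if __name__ == "__main__":
    for initial_score in (7, 8, 9):
        print(initial_score, significance(initial_score, MEAN, SD))
```

Under this assumption a significance of 4.23 means the initial score lies a little more than four standard deviations above the mean, which matches the banner printed above that group of scores.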



A 100% identical sequence to the query sequence was not found. 



The list of best scores is: 



                                                          Init.  Opt.
    Sequence Name  Description                    Length  Score  Score   Sig.  Frame

          **** 4 standard deviations above mean ****

 1. S00929      Photosystem II D2 protein - Ba     353     9     9   4.93     0
 2. HMIVN1      Hemagglutinin precursor - Infl     566     8     8   4.23     0
 3. CBLV55      Cytochrome b559, component E -      83     8     8   4.23     0
 4. A27817      Lignin peroxidase precursor -      373     8     8   4.23     0
 5. A25539      O-Acetylhomoserine-O-acetyls       444     8     8   4.23     0
 6. A29278      Uncoupling protein - Rat           307     8     8   4.23     0
 7. A27635      Ig heavy chain precursor V reg     122     8     8   4.23     0
 8. A26294      Uncoupling protein - Rat           307     8     8   4.23     0
 9. A25104      Band 3 protein, nonerythroid (     865     8    10   4.23     0
10. A24363      Brown fat mitochondrial uncoup     306     8     8   4.23     0
11. A25687      H-2 class II histocompatibilit     253     8     9   4.23     0
12. B25687      H-2 class II histocompatibilit     217     8     8   4.23     0

          **** 3 standard deviations above mean ****

13. F2NTD2      Photosystem II D2 protein - Co     353     7     8   3.52     0
14. F2LVD2      Photosystem II D2 protein - Li     353     7     8   3.52     0
15. GVMS11      Ig heavy chain V region - Mous     121     7     7   3.52     0
16. F2SPD2      Photosystem II D2 protein - Sp     353     7     8   3.52     0
17. F2PMD2      Photosystem II D2 protein - Ga     353     7     8   3.52     0
18. UBBYB       Tubulin beta chain - Yeast (Sa     457     7     8   3.52     0
19. Q0BE37      Hypothetical BGLF5 protein - E     470     7     7   3.52     0
20. HVMS3       Ig heavy chain precursor V reg     117     7     7   3.52     0




The scores below are sorted by optimized score. 
Significance is calculated based on optimized score. 

A 100% identical sequence to the query sequence was not found. 



The list of best scores is: 

                                                          Init.  Opt.
    Sequence Name  Description                    Length  Score  Score   Sig.  Frame

          **** 6 standard deviations above mean ****

 1. A25104      Band 3 protein, nonerythroid (     865     8    10   6.20     0

          **** 4 standard deviations above mean ****

 2. S00929      Photosystem II D2 protein - Ba     353     9     9   4.96     0



Query sequence being compared: GUEST-346-1 
Number of sequences searched: 12476
Number of scores above cutoff: 4735 

Results of the initial comparison of GUEST-346-1 with: 
Data bank : PIR 21.0, all entries

[Histogram of the initial scores: NUMBER OF SEQUENCES (axis ticks at 10, 100, 500, 1000, 5000 and 10000) plotted against SCORE (0 to 6), with a STDEV scale (0 to 4) beneath the score axis.]



PARAMETERS 



Similarity matrix        Unitary      K-tuple              2
Mismatch penalty         1            Joining penalty     20
Gap penalty              1.00         Window size         32
Gap size penalty         0.05
Cutoff score             0
Randomization group      0

Initial scores to save     20         Alignments to save  10
Optimized scores to save   20         Display context     10



Gaps = 



o 



          X 10 20
          SFCR----PIEY--------LVDIFQEYPXXX
          || |    |  |        || |   ||
LCIREKYMQKSFQRFPKTPSKYLRNIDGEALVAIESFYPVFTPPPKKGEDPF
150 X 160 170 180 X 190



Results file guest-346-1.res made by alexk on Thu 26 Apr 90 11:04:49-PDT.



          SFCRPIEY------------LVDIF---QEYPXXX
           |    ||            |||||   |  |
FNSSYRRGDPEFEAMLEYSGGIVDTVAKESLVDIFPWLQIFPNRDLALLKRCLKV
190 200 210 220 230 X 240



 9. GUEST-346-1
    POL2$DROME    RETROVIRUS-RELATED POL POLYPROTEIN (REVERSE TRANSC

ID   POL2$DROME     STANDARD;      PRT;   930 AA.
AC   P10402;
DT   01-MAR-1989 (REL. 10, CREATED)
DT   01-MAR-1989 (REL. 10, LAST SEQUENCE UPDATE)
DT   01-MAR-1989 (REL. 10, LAST ANNOTATION UPDATE)
DE   RETROVIRUS-RELATED POL POLYPROTEIN (REVERSE TRANSCRIPTASE
DE   (EC 2.7.7.49); ENDONUCLEASE) (TRANSPOSON GYPSY) (GENE NAME: POL)
DE   (VERSION 2) (FRAGMENT).
OS   FRUIT FLY (DROSOPHILA MELANOGASTER).
OC   EUKARYOTA; METAZOA; ARTHROPODA; INSECTA.
RN   [1] (SEQUENCE FROM N.A.)
RA   YUKI S., ISHIMARU S., INOUYE S., SAIGO K.;
RL   NUCLEIC ACIDS RES. 14:3017-3030(1986).
DR   PIR; A23769; GNFFG2.
DR   EMBL; X03734; DMGYPSY.
KW   HYDROLASE; ENDONUCLEASE; RNA-DIRECTED DNA POLYMERASE; POLYPROTEIN.
FT   NON_TER       1      1
FT   NON_TER     930    930
SQ   SEQUENCE   930 AA;  105820 MW;  4453198 CN;

Initial Score    =  5   Optimized Score =  9   Significance = 4.74
Residue Identity = 25%  Matches         =  9   Mismatches   = 11
Gaps             = 15   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYL---------------VDIFQEYPXXX
           |    |||               |   ||||
NMRVSQEKTRFFKESVEYLGFIVSKDGTKSDPEKVKAIQEYPEPDCVYKVRSFLG
350 360 370 380 390 X 400



10. GUEST-346-1
    AMDM$RAT      AMP DEAMINASE (EC 3.5.4.6) (MYOADENYLATE DEAMINASE

ID   AMDM$RAT       STANDARD;      PRT;   747 AA.
AC   P10759;
DT   01-JUL-1989 (REL. 11, CREATED)
DT   01-JUL-1989 (REL. 11, LAST SEQUENCE UPDATE)
DT   01-JUL-1989 (REL. 11, LAST ANNOTATION UPDATE)
DE   AMP DEAMINASE (EC 3.5.4.6) (MYOADENYLATE DEAMINASE)
OS   RAT (RATTUS NORVEGICUS).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; RODENTIA.
RN   [1] (MUSCLE, SEQUENCE FROM N.A., AND SEQUENCE OF 36-548)
RA   SABINA R.L., MARQUETANT R., DESAI N.M., KALETHA K., HOLMES E.W.
RL   J. BIOL. CHEM. 262:12397-12400(1987).
CC   -!- FUNCTION: AMP DEAMINASE PLAYS A CRITICAL ROLE IN ENERGY
CC       METABOLISM.
CC   -!- CATALYTIC ACTIVITY: AMP + H(2)O = IMP + NH(3).
CC   -!- PATHWAY: PURINE NUCLEOTIDE CYCLE.
CC   IN MAMMALS THE EXPRESSION OF AMP DEAMINASE IS DEVELOPMENTAL AND
CC   TISSUE-SPECIFIC CONTROLLED.
DR   EMBL; J02811; RSAMPDA.
KW   HYDROLASE.
SQ   SEQUENCE   747 AA;  86431 MW;  2894385 CN;

Initial Score    =  6   Optimized Score =  9   Significance = 4.74



HL f to*} ( i 

DT   01-APR-1988 (REL. 07, CREATED)
DT   01-APR-1988 (REL. 07, LAST SEQUENCE UPDATE)
DT   01-NOV-1988 (REL. 09, LAST ANNOTATION UPDATE)
DE   3-PHOSPHOSHIKIMATE 1-CARBOXYVINYLTRANSFERASE (EC 2.5.1.19)
DE   (5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASE) (EPSP SYNTHASE)
DE   (GENE NAME: AROA).
OS   SALMONELLA TYPHIMURIUM.
OC   PROKARYOTA; BACTERIA; GRAM-NEGATIVE FACULTATIVELY ANAEROBIC RODS;
OC   ENTEROBACTERIACEAE.
RN   [1] (SEQUENCE FROM N.A.)
RA   STALKER D.M., HIATT W.R., COMAI L.;
RL   J. BIOL. CHEM. 260:4724-4728(1985).
CC   -!- CATALYTIC ACTIVITY: PHOSPHOENOLPYRUVATE + 3-PHOSPHOSHIKIMATE =
CC       ORTHOPHOSPHATE + O(5)-(1-CARBOXYVINYL)-3-PHOSPHOSHIKIMATE.
CC   -!- PATHWAY: SIXTH STEP IN THE BIOSYNTHESIS FROM CHORISMATE OF THE
CC       AROMATIC AMINO ACIDS (THE SHIKIMATE PATHWAY).
CC   -!- SUBUNIT: MONOMERIC.
DR   EMBL; M10947; STAROAPM.
KW   AROMATIC AMINO ACID BIOSYNTHESIS; TRANSFERASE.
FT   VARIANT     101    101       P -> S (CONFERS GLYPHOSATE INHIBITION).
FT   ACT_SITE    408    408       PUTATIVE.
SQ   SEQUENCE   427 AA;  46157 MW;  905386 CN;

Initial Score    =  7   Optimized Score =  9   Significance = 4.74
Residue Identity = 31%  Matches         = 10   Mismatches   = 10
Gaps             = 12   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVD-----------IFQE-YPXXX
             |||  |||             || ||
NEIVLTGEPRMKERPIGHLVDSLRQGGANIDYLEQENYPPLRLRGGFTGGDI
120 130 140 150 X 160
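
The summary figures printed with each alignment can be recounted from the aligned strings. The sketch below assumes that a column with a gap character in either sequence counts as a gap, that X in the query counts as a mismatch, and that Residue Identity is matches divided by all alignment columns, truncated to a whole percent; these rules are not stated in the printout but reproduce its figures (for example 10 matches, 10 mismatches and 12 gaps give 31%). The function name `alignment_stats` is hypothetical, and the gap placement in the example strings is a reconstruction chosen to agree with the printed counts, since the gap columns are not fully legible in the source.

```python
def alignment_stats(query: str, target: str, gap: str = "-") -> dict:
    """Count matches, mismatches and gaps over two equal-length aligned strings."""
    if len(query) != len(target):
        raise ValueError("aligned strings must have the same length")
    if not query:
        raise ValueError("alignment is empty")
    matches = mismatches = gaps = 0
    for q, t in zip(query, target):
        if q == gap or t == gap:
            gaps += 1          # column with a gap in either sequence
        elif q == t:
            matches += 1       # identical residues
        else:
            mismatches += 1    # different residues, including X in the query
    columns = matches + mismatches + gaps
    return {
        "matches": matches,
        "mismatches": mismatches,
        "gaps": gaps,
        # Assumed rule: integer percent, truncated rather than rounded.
        "identity_percent": 100 * matches // columns,
    }


if __name__ == "__main__":
    # Aligned region of the query against the Salmonella aroA product,
    # without the ten residues of display context on either side.
    query = "SFCRPIEYLVD" + "-" * 11 + "IFQE-YPXXX"
    target = "MKERPIGHLVDSLRQGGANIDYLEQENYPPLR"
    print(alignment_stats(query, target))
```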



 8. GUEST-346-1
    CPT1$CHICK    CYTOCHROME P450 XVIIA1 (P450-C17) (EC 1.14.99.9) (

ID   CPT1$CHICK     STANDARD;      PRT;   508 AA.
AC   P12394;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   CYTOCHROME P450 XVIIA1 (P450-C17) (EC 1.14.99.9) (STEROID 17-ALPHA-
DE   HYDROXYLASE/17,20 LYASE) (GENE NAME: CYP17).
OS   CHICKEN (GALLUS GALLUS).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; AVES.
RN   [1] (SEQUENCE FROM N.A.)
RA   ONO H., IWASAKI M., SAKAMOTO N., MIZUNO S.;
RL   GENE 66:77-85(1988).
CC   -!- FUNCTION: CYTOCHROMES P450 ARE A GROUP OF HEME-THIOLATE
CC       MONOOXYGENASES. THEY OXIDIZE A VARIETY OF STRUCTURALLY UNRELATED
CC       COMPOUNDS, INCLUDING STEROIDS, FATTY ACIDS, AND XENOBIOTICS.
CC   -!- CATALYTIC ACTIVITY: A STEROID + AH(2) + O(2) = A 17-ALPHA-
CC       HYDROXYSTEROID + A + H(2)O.
DR   PIR; JT0318; O4CHC7.
KW   ELECTRON TRANSPORT; OXIDOREDUCTASE; MONOOXYGENASE; MEMBRANE;
KW   HEME; STEROIDOGENESIS.
FT   BINDING     445    445       HEME.
SQ   SEQUENCE   508 AA;  56984 MW;  1333572 CN;

Initial Score    =  6   Optimized Score =  9   Significance = 4.74
Residue Identity = 28%  Matches         = 10   Mismatches   = 10
Gaps             = 15   Conservative Substitutions = 0

          X 10 20



DR   PIR; B26173; RHRTG.
KW   AMIDATION; HORMONE; HYPOTHALAMUS; PLACENTA; SIGNAL.
FT   SIGNAL        1     23
FT   CHAIN        24     92       PROGONADOLIBERIN.
FT   PEPTIDE      24     33       GONADOLIBERIN.
FT   PEPTIDE      37     92       PROLACTIN RELEASE-INHIBITING FACTOR.
FT   ACT_SITE     26     26       APPEARS TO BE ESSENTIAL FOR BIOLOGICAL
FT                                ACTIVITY.
FT   MOD_RES      24     24       PYRROLIDONE CARBOXYLIC ACID.
FT   MOD_RES      33     33       AMIDATION (G-34 PROVIDE AMIDE GROUP).
SQ   SEQUENCE    92 AA;  10500 MW;  39210 CN;

Initial Score    =  7   Optimized Score =  9   Significance = 4.74
Residue Identity = 40%  Matches         =  8   Mismatches   = 12
Gaps             =  0   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVDIFQEYPXXX
             |  | ||| |||
SQHWSYGLRPGGKRNTEHLVDSFQEMGKEEDQMAEPQNFE
30 X 40 50 X 60



 6. GUEST-346-1
    AROA$ECOLI    3-PHOSPHOSHIKIMATE 1-CARBOXYVINYLTRANSFERASE (EC 2

ID   AROA$ECOLI     STANDARD;      PRT;   427 AA.
AC   P07638;
DT   01-APR-1988 (REL. 07, CREATED)
DT   01-APR-1988 (REL. 07, LAST SEQUENCE UPDATE)
DT   01-NOV-1988 (REL. 09, LAST ANNOTATION UPDATE)
DE   3-PHOSPHOSHIKIMATE 1-CARBOXYVINYLTRANSFERASE (EC 2.5.1.19)
DE   (5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASE) (EPSP SYNTHASE)
DE   (GENE NAME: AROA).
OS   ESCHERICHIA COLI.
OC   PROKARYOTA; BACTERIA; GRAM-NEGATIVE FACULTATIVELY ANAEROBIC RODS;
OC   ENTEROBACTERIACEAE.
RN   [1] (SEQUENCE FROM N.A.)
RA   DUNCAN K., LEWENDON A., COGGINS J.R.;
RL   FEBS LETT. 170:59-63(1984).
CC   -!- CATALYTIC ACTIVITY: PHOSPHOENOLPYRUVATE + 3-PHOSPHOSHIKIMATE =
CC       ORTHOPHOSPHATE + O(5)-(1-CARBOXYVINYL)-3-PHOSPHOSHIKIMATE.
CC   -!- PATHWAY: SIXTH STEP IN THE BIOSYNTHESIS FROM CHORISMATE OF THE
CC       AROMATIC AMINO ACIDS (THE SHIKIMATE PATHWAY).
CC   -!- SUBUNIT: MONOMERIC.
DR   EMBL; X00557; ECAROA.
KW   AROMATIC AMINO ACID BIOSYNTHESIS; TRANSFERASE.
FT   ACT_SITE    408    408       PUTATIVE.
SQ   SEQUENCE   427 AA;  46164 MW;  892569 CN;

Initial Score    =  7   Optimized Score =  9   Significance = 4.74
Residue Identity = 31%  Matches         = 10   Mismatches   = 10
Gaps             = 12   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVD-----------IFQE-YPXXX
NDIVLTGEPRMKERPIGHLVDALRLGGAKITYLEQENYPPLRLQGGFTGGNV
120 130 140 150 X 160



 7. GUEST-346-1
    AROA$SALTY    3-PHOSPHOSHIKIMATE 1-CARBOXYVINYLTRANSFERASE (EC 2

ID   AROA$SALTY     STANDARD;      PRT;   427 AA.

140 X 150 160 X 170



GUEST-346-1
    PSBD$HORVU    PHOTOSYSTEM II D2 PROTEIN (GENE NAME: PSBD).

ID   PSBD$HORVU     STANDARD;      PRT;   353 AA.
AC   P11849;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   PHOTOSYSTEM II D2 PROTEIN (GENE NAME: PSBD).
OS   BARLEY (HORDEUM VULGARE).
OG   CHLOROPLAST.
OC   EUKARYOTA; PLANTA; SPERMATOPHYTA.
RN   [1] (CV. DONETSKY 6, SEQUENCE FROM N.A.)
RA   EFIMOV V.A., ANDREEVA A.V., REVERDATTO S.V., JUNG R.,
RA   CHAKHMAKHCHEVA O.G.;
RL   NUCLEIC ACIDS RES. 16:5686-5686(1988).
CC   -!- FUNCTION: THIS IS ONE OF THE TWO REACTION CENTRE PROTEINS OF PSII,
CC       D2 PROTEIN IS NEEDED FOR ASSEMBLY OF A STABLE PSII COMPLEX.
CC   -!- SIMILARITY: BACTERIAL REACTION CENTER L AND M CHAINS, AND PLANTS
CC       PHOTOSYSTEM II D1 AND D2 PROTEINS ARE RELATED.
DR   EMBL; X07522; HVD2PSBD.
KW   TRANSMEMBRANE; ELECTRON TRANSPORT; THYLAKOID MEMBRANE; PHOTOSYSTEM II;
KW   CHLOROPLAST; IRON.
FT   TRANSMEM     36     57
FT   TRANSMEM    109    129
FT   TRANSMEM    142    164
FT   TRANSMEM    192    218
FT   TRANSMEM    266    286
FT   METAL       215    215       IRON (NON HAEM).
FT   METAL       225    225       IRON (NON HAEM).
FT   METAL       269    269       IRON (NON HAEM).
SQ   SEQUENCE   353 AA;  39669 MW;  630865 CN;

Initial Score    =  9   Optimized Score =  9   Significance = 4.74
Residue Identity = 40%  Matches         =  8   Mismatches   = 12
Gaps             =  0   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVDIFQEYPXXX
SVQLRPYNAISFSGPIAVFVSVFLIYPLGQSDWFFPPDFG
140 X 150 160 X 170



 5. GUEST-346-1
    GONL$RAT      GONADOLIBERIN PRECURSOR (LHRH) (LUTEINIZING HORMON

ID   GONL$RAT       STANDARD;      PRT;    92 AA.
AC   P07490;
DT   01-APR-1988 (REL. 07, CREATED)
DT   01-APR-1988 (REL. 07, LAST SEQUENCE UPDATE)
DT   01-MAR-1989 (REL. 10, LAST ANNOTATION UPDATE)
DE   GONADOLIBERIN PRECURSOR (LHRH) (LUTEINIZING HORMONE RELEASING
DE   HORMONE) (GONADOTROPIN RELEASING HORMONE) (GNRH).
OS   RAT (RATTUS NORVEGICUS).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; RODENTIA.
RN   [1] (SEQUENCE FROM N.A.)
RA   ADELMAN J.P., MASON A.J., HAYFLICK J.S., SEEBURG P.H.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 83:179-183(1986).
CC   -!- FUNCTION: STIMULATES THE SECRETION OF BOTH LUTEINIZING AND
CC       FOLLICLE-STIMULATING HORMONES.







KW   CHLOROPLAST; IRON.






FT   TRANSMEM     36     57
FT   TRANSMEM    109    129
FT   TRANSMEM    142    164
FT   TRANSMEM    192    218
FT   TRANSMEM    266    286
FT   METAL       215    215       IRON (NON HAEM).
FT   METAL       225    225       IRON (NON HAEM).
FT   METAL       269    269       IRON (NON HAEM).
SQ   SEQUENCE   353 AA;  39571 MW;  631135 CN;

Initial Score    =  9   Optimized Score =  9   Significance = 4.74
Residue Identity = 40%  Matches         =  8   Mismatches   = 12
Gaps             =  0   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVDIFQEYPXXX
          ||  ||   |  |  ||
SVQLRPYNAISFSSPIAVFVSVFLIYPLGQSGWFFAPSFG
140 X 150 160 X 170



 3. GUEST-346-1
    PSBD$ORYSA    PHOTOSYSTEM II D2 PROTEIN (GENE NAME: PSBD).

ID   PSBD$ORYSA     STANDARD;      PRT;   353 AA.
AC   P12095;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   PHOTOSYSTEM II D2 PROTEIN (GENE NAME: PSBD).
OS   RICE (ORYZA SATIVA).
OG   CHLOROPLAST.
OC   EUKARYOTA; PLANTA; SPERMATOPHYTA.
RN   [1] (CV. NIPPONBARE, SEQUENCE FROM N.A.)
RA   SUGIURA M.;
RL   SUBMITTED (JUL-1989) TO THE EMBL DATA LIBRARY.
RN   [2] (GENE ORGANIZATION, SITES, AND FEATURES)
RA   HIRATSUKA J., SHIMADA H., WHITTIER R., ISHIBASHI T., SAKAMOTO M.,
RA   MORI M., KONDO C., HONJI Y., SUN C.-R., MENG B.-Y., LI Y.-Q.,
RA   KANNO A., NISHIZAWA Y., HIRAI A., SHINOZAKI K., SUGIURA M.;
RL   MOL. GEN. GENET. 217:185-194(1989).
CC   -!- FUNCTION: THIS IS ONE OF THE TWO REACTION CENTRE PROTEINS OF PSII,
CC       D2 PROTEIN IS NEEDED FOR ASSEMBLY OF A STABLE PSII COMPLEX.
CC   -!- SIMILARITY: BACTERIAL REACTION CENTER L AND M CHAINS, AND PLANTS
CC       PHOTOSYSTEM II D1 AND D2 PROTEINS ARE RELATED.
DR   EMBL; X15901; CHOSXX.
KW   TRANSMEMBRANE; ELECTRON TRANSPORT; THYLAKOID MEMBRANE; PHOTOSYSTEM II;
KW   CHLOROPLAST; IRON.
FT   TRANSMEM     36     56
FT   TRANSMEM    109    129
FT   TRANSMEM    142    164
FT   TRANSMEM    192    218
FT   TRANSMEM    266    286
FT   METAL       215    215       IRON (NON HAEM).
FT   METAL       225    225       IRON (NON HAEM).
FT   METAL       269    269       IRON (NON HAEM).
SQ   SEQUENCE   353 AA;  39573 MW;  631107 CN;

Initial Score    =  9   Optimized Score =  9   Significance = 4.74
Residue Identity = 40%  Matches         =  8   Mismatches   = 12
Gaps             =  0   Conservative Substitutions = 0

          X 10 20
          SFCRPIEYLVDIFQEYPXXX



11. KPCE$RAT    PROTEIN KINASE C, EPSILON TYPE     737     6     9   4.74     0
12. YCY1$SPIOL  HYPOTHETICAL 250 KD PROTEIN (0    2131     6     9   4.74     0
13. HCYD$EURCA  HEMOCYANIN D CHAIN.                627     6     9   4.74     0

          **** 3 standard deviations above mean ****

14. UCP$MOUSE   MITOCHONDRIAL BROWN FAT UNCOUP     306     8     8   3.56     0
15. LIG1$PHACH  LIGNINASE PRECURSOR (EC 1.11.1     372     8     8   3.56     0
16. UCP$RAT     MITOCHONDRIAL BROWN FAT UNCOUP     306     8     8   3.56     0
17. UCP$MESAU   MITOCHONDRIAL BROWN FAT UNCOUP     306     8     8   3.56     0
18. PSBE$MARPO  CYTOCHROME B559 ALPHA CHAIN (G      83     8     8   3.56     0
19. MET5$YEAST  O-ACETYLHOMOSERINE (THIOL)-LYA     444     8     8   3.56     0
20. HEMA$INASW  HEMAGGLUTININ PRECURSOR.           566     8     8   3.56     0




GUEST-346-1
B3LP$HUMAN   NON-ERYTHROID BAND 3-LIKE PROTEIN (HKB3) (FRAGMENT

ID   B3LP$HUMAN     STANDARD;      PRT;   865 AA.
AC   P04920;
DT   13-AUG-1987 (REL. 05, CREATED)
DT   13-AUG-1987 (REL. 05, LAST SEQUENCE UPDATE)
DT   13-AUG-1987 (REL. 05, LAST ANNOTATION UPDATE)
DE   NON-ERYTHROID BAND 3-LIKE PROTEIN (HKB3) (FRAGMENT).
OS   HUMAN (HOMO SAPIENS).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; PRIMATES.
RN   [1] (SEQUENCE FROM N.A.)
RA   DEMUTH D.R., SHOWE L.C., BALLANTINE M., PALUMBO A., FRASER P.J.,
RA   CIOE L., ROVERA G., CURTIS P.J.;
RL   EMBO J. 5:1205-1214(1986).
DR   EMBL; X03918; HSHKB3R.
FT   NON_TER       1      1
SQ   SEQUENCE   865 AA;  95959 MW;  4025588 CN;

Initial Score    =  8   Optimized Score = 10   Significance = 5.93
Residue Identity = 29%  Matches         = 10   Mismatches   =   10
Gaps             = 14   Conservative Substitutions     =    0

            X                     10         X
            SFCRPI--------------EYLVDIFQEYPXXX
             | | |                || |||| |
  EGSFLVRFVSRFTREIFAFLISLIFIYETFYKLVKIFQEHPLHGCSASNSSEVD
  440 X 450 460 470 480



I. GUEST-346-1 

PSBD$SECCE   PHOTOSYSTEM II D2 PROTEIN (GENE NAME: PSBD).

ID   PSBD$SECCE     STANDARD;      PRT;   353 AA.
AC   P10803;
DT   01-JUL-1989 (REL. 11, CREATED)
DT   01-JUL-1989 (REL. 11, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   PHOTOSYSTEM II D2 PROTEIN (GENE NAME: PSBD).
OS   RYE (SECALE CEREALE).
OG   CHLOROPLAST.
OC   EUKARYOTA; PLANTA; SPERMATOPHYTA.
RN   [1] (SEQUENCE FROM N.A.)
RA   BUKHAROV A.A., KOLOSOV V.L., KLEZOVICH O.N., ZOLOTAREV A.S.;
RL   NUCLEIC ACIDS RES. 17:798-798(1989).
CC   -!- FUNCTION: THIS IS ONE OF THE TWO REACTION CENTRE PROTEINS OF PS II.
CC       D2 PROTEIN IS NEEDED FOR ASSEMBLY OF A STABLE PS II COMPLEX.
CC   -!- SIMILARITY: BACTERIAL REACTION CENTER L AND M CHAINS, AND PLANTS
CC       PHOTOSYSTEM II D1 AND D2 PROTEINS ARE RELATED.
DR   EMBL; X13366; CHSCPSBD.
KW   TRANSMEMBRANE; ELECTRON TRANSPORT; THYLAKOID MEMBRANE; PHOTOSYSTEM II
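
The entries in this printout use two-letter line codes (ID, AC, DT, DE, OS, RN, RA, RL, CC, DR, KW, FT, SQ). As a minimal sketch of how such a record could be read programmatically, the following Python groups the lines of one entry by code. The function name `parse_entry` and the rule used to recognise a coded line (two uppercase letters followed by a space) are assumptions made for illustration, not part of the search program's output.

```python
from collections import defaultdict


def parse_entry(text):
    """Group the lines of one flat-file entry by their two-letter line code.

    Hypothetical helper: a line is treated as coded only if it starts with
    two uppercase letters followed by a space (or is exactly two letters).
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        code = line[:2]
        if len(code) < 2 or not (code.isalpha() and code.isupper()):
            continue  # skip blank lines, score lines, alignment rows
        if len(line) > 2 and line[2] != " ":
            continue  # e.g. a sequence row that happens to start with letters
        fields[code].append(line[2:].strip())
    return dict(fields)


sample = """ID   PSBD$SECCE     STANDARD;      PRT;   353 AA.
AC   P10803;
OS   RYE (SECALE CEREALE).
OG   CHLOROPLAST.
Initial Score = 9
"""
entry = parse_entry(sample)
print(entry["AC"], entry["OS"])
```

Repeated codes such as DT or CC accumulate in order, so continuation lines of a comment stay together.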



Number of residues: 3797058
Number of sequences searched: 12305
Number of scores above cutoff: 3754



Cut-off raised to 2. 
Cut-off raised to 3.
Cut-off raised to 4. 
Cut-off raised to 5. 

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 

A 100% identical sequence to the query sequence was not found. 
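
The significance figures in these lists are consistent with a plain z-score, that is, the number of standard deviations by which a score exceeds the mean of all scores. The sketch below illustrates that reading. The printout gives the mean only as a rounded integer (2) with a standard deviation of 1.39, so the value 1.98 used here is an assumed unrounded mean chosen because it reproduces the listed figures; it is not stated in the output.

```python
def significance(score, mean, stdev):
    """Return how many standard deviations `score` lies above `mean`."""
    if stdev <= 0:
        raise ValueError("stdev must be positive")
    return (score - mean) / stdev


ASSUMED_MEAN = 1.98    # assumption: the printout shows only the rounded value 2
REPORTED_STDEV = 1.39  # taken from SEARCH STATISTICS

for initial_score in (9, 8, 7):
    z = significance(initial_score, ASSUMED_MEAN, REPORTED_STDEV)
    print(initial_score, round(z, 2))
```

Under this assumption, initial scores of 9, 8, and 7 fall in the "5", "4", and "3 standard deviations above mean" groups respectively.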



The list of best scores is:

                                                        Init.  Opt.
    Sequence Name Description                    Length Score Score  Sig. Frame

**** 5 standard deviations above mean ****

 1. PSBD$HORVU    PHOTOSYSTEM II D2 PROTEIN (GEN    353     9     9  5.05     0
 2. PSBD$SECCE    PHOTOSYSTEM II D2 PROTEIN (GEN    353     9     9  5.05     0
 3. PSBD$ORYSA    PHOTOSYSTEM II D2 PROTEIN (GEN    353     9     9  5.05     0

**** 4 standard deviations above mean ****

 4. HEMA$INASW    HEMAGGLUTININ PRECURSOR.          566     8     8  4.33     0
 5. LIG1$PHACH    LIGNINASE PRECURSOR (EC 1.11.1    372     8     8  4.33     0
 6. B3LP$HUMAN    NON-ERYTHROID BAND 3-LIKE PROT    865     8    10  4.33     0
 7. MET5$YEAST    O-ACETYLHOMOSERINE (THIOL)-LYA    444     8     8  4.33     0
 8. PSBE$MARPO    CYTOCHROME B559 ALPHA CHAIN (G     83     8     8  4.33     0
 9. UCP$MESAU     MITOCHONDRIAL BROWN FAT UNCOUP    306     8     8  4.33     0
10. UCP$RAT       MITOCHONDRIAL BROWN FAT UNCOUP    306     8     8  4.33     0
11. UCP$MOUSE     MITOCHONDRIAL BROWN FAT UNCOUP    306     8     8  4.33     0

**** 3 standard deviations above mean ****

12. HCYA$PANIN    HEMOCYANIN A CHAIN.               657     7     7  3.61     0
13. GONL$RAT      GONADOLIBERIN PRECURSOR (LHRH)     92     7     9  3.61     0
14. ICIC$HIRME    EGLIN C.                           70     7     7  3.61     0
15. HEMA$INATA    HEMAGGLUTININ (FRAGMENT).         343     7     7  3.61     0
16. HEMA$INAUS    HEMAGGLUTININ PRECURSOR.          566     7     7  3.61     0
17. HV05$MOUSE    IG HEAVY CHAIN PRECURSOR V REG    117     7     7  3.61     0
18. HV01$MOUSE    IG HEAVY CHAIN V REGION (MPC 1    121     7     7  3.61     0
19. HEMA$INAJP    HEMAGGLUTININ PRECURSOR.          562     7     7  3.61     0
20. CN17$DICDI    3',5'-CYCLIC-NUCLEOTIDE PHOSPH    452     7     7  3.61     0



The scores below are sorted by optimized score.
Significance is calculated based on optimized score.

A 100% identical sequence to the query sequence was not found.

The list of best scores is:

                                                        Init.  Opt.
    Sequence Name Description                    Length Score Score  Sig. Frame

**** 5 standard deviations above mean ****

 1. B3LP$HUMAN    NON-ERYTHROID BAND 3-LIKE PROT    865     8    10  5.93     0

**** 4 standard deviations above mean ****

 2. PSBD$SECCE    PHOTOSYSTEM II D2 PROTEIN (GEN    353     9     9  4.74     0
 3. PSBD$ORYSA    PHOTOSYSTEM II D2 PROTEIN (GEN    353     9     9  4.74     0
 4. PSBD$HORVU    PHOTOSYSTEM II D2 PROTEIN (GEN    353     9     9  4.74     0
 5. GONL$RAT      GONADOLIBERIN PRECURSOR (LHRH)     92     7     9  4.74     0
 6. AROA$ECOLI    3-PHOSPHOSHIKIMATE 1-CARBOXYVI    427     7     9  4.74     0
 7. AROA$SALTY    3-PHOSPHOSHIKIMATE 1-CARBOXYVI    427     7     9  4.74     0
 8. CPT1$CHICK    CYTOCHROME P450 XVIIA1 (P450-C    508     6     9  4.74     0
 9. POL2$DROME    RETROVIRUS-RELATED POL POLYPRO    930     5     9  4.74     0



[Histogram: NUMBER OF SEQUENCES (axis ticks 10, 100, 500, 1000, 5000) plotted against SCORE, with a STDEV scale beneath; plot points not recoverable.]



PARAMETERS

Similarity matrix        Unitary     K-tuple                   2
Mismatch penalty               1     Joining penalty          20
Gap penalty                 1.00     Window size              32
Gap size penalty            0.05
Cutoff score                   0
Randomization group            0

Initial scores to save        20     Alignments to save       10
Optimized scores to save      20     Display context          10



SEARCH STATISTICS

Scores:     Mean     Median     Standard Deviation
               2          3                   1.39

Times:      CPU              Total Elapsed
            00:01:51.03      00:01:54.00

Results file guest-346-1-spt.res made by alexk on Thu 26 Apr 90 11:06:03-PDT.

Query sequence being compared: GUEST-346-1 
Number of sequences searched: 12305 
Number of scores above cutoff: 3754 



Results of the initial comparison of GUEST-346-1 with:
Data bank : Swiss-Prot 12, all entries



Results file guest-346-spt.res made by alexk on Thu 26 Apr 90 9:25:15-PDT.

Query sequence being compared: GUEST-346

Number of sequences searched: 12305
Number of scores above cutoff: 3946

Results of the initial comparison of GUEST-346 with:
Data bank : Swiss-Prot 12, all entries

[Histogram: NUMBER OF SEQUENCES (axis ticks 0, 10, 50, 100, 500, 1000, 5000, 10000) plotted against SCORE (0 to 10), with a STDEV scale (-1 to 4) beneath; plot points not recoverable.]



PARAMETERS

Similarity matrix        Unitary     K-tuple                   2
Mismatch penalty               1     Joining penalty          20
Gap penalty                 1.00     Window size              32
Gap size penalty            0.05
Cutoff score                   0
Randomization group            0

Initial scores to save        20     Alignments to save       10
Optimized scores to save      20     Display context          10



SEARCH STATISTICS

Scores:     Mean     Median     Standard Deviation
               3          4                   1.39

Times:      CPU              Total Elapsed
            00:01:16.02      00:05:07.00



Number of residues: 3797058
Number of sequences searched: 12305
Number of scores above cutoff: 3346

Cut-off raised to 2.
Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.

The scores below are sorted by initial score.
Significance is calculated based on initial score.

A 100% identical sequence to the query sequence was not found.

The list of best scores is:



                                                        Init.  Opt.
    Sequence Name Description                    Length Score Score  Sig. Frame

**** 5 standard deviations above mean ****

 1. HIS2$NEUCR    PHOSPHORIBOSYL-AMP CYCLOHYDROL    863    10    12  5.05     0
 2. CFAH$MOUSE    COMPLEMENT FACTOR H PRECURSOR    1234    10    11  5.05     0
 3. IPS$STRCL     ISOPENICILLIN N SYNTHETASE (IP    329    10    10  5.05     0
 4. ODB1$BOVIN    2-OXOISOVALERATE DEHYDROGENASE    455    10    13  5.05     0
 5. ODB1$HUMAN    2-OXOISOVALERATE DEHYDROGENASE    444    10    13  5.05     0
 6. ODB1$RAT      2-OXOISOVALERATE DEHYDROGENASE    441    10    14  5.05     0
 7. PDGA$HUMAN    PLATELET-DERIVED GROWTH FACTOR    211    10    10  5.05     0
 8. PIP1$BOVIN    1-PHOSPHATIDYLINOSITOL-4,5-BIS   1216    10    12  5.05     0
 9. MERA$STAAU    MERCURIC REDUCTASE (EC 1.16.1.    547    10    11  5.05     0
10. TOXA$PSEAE    EXOTOXIN A PRECURSOR (EC 2.4.2    638    10    12  5.05     0
11. UL37$HSV11    PROTEIN UL37 (GENE NAME: UL37)   1123    10    11  5.05     0

**** 4 standard deviations above mean ****

12. APH6$ACIBA    APH(3')-VI PROTEIN (3'-AMINOGL    259     9    11  4.33     0
13. CRYT$BACTI    130 KD CRYSTAL PROTEIN (DELTA    1135     9    11  4.33     0
14. DEDD$ECOLI    DEDD PROTEIN (GENE NAME: DEDD)    211     9    11  4.33     0
15. FA11$HUMAN    COAGULATION FACTOR XI PRECURSO    625     9     9  4.33     0
16. LPXA$ECOLI    UDP-ACETYLGLUCOSAMINE ACYLTRAN    262     9    10  4.33     0
17. CD12$MOUSE    CD1.2 SURFACE ANTIGEN PRECURSO    297     9    12  4.33     0
18. KS6A$XENLA    RIBOSOMAL PROTEIN S6 KINASE II    733     9    10  4.33     0
19. KS6B$XENLA    RIBOSOMAL PROTEIN S6 KINASE II    629     9    10  4.33     0
20. EGF$HUMAN     EPIDERMAL GROWTH FACTOR (EGF)    1207     9     9  4.33     0



The scores below are sorted by optimized score.
Significance is calculated based on optimized score.

A 100% identical sequence to the query sequence was not found.

The list of best scores is:

                                                        Init.  Opt.
    Sequence Name Description                    Length Score Score  Sig. Frame

**** 5 standard deviations above mean ****

 1. ODP2$AZOVI    DIHYDROLIPOAMIDE ACETYLTRANSFE    638     9    14  5.87     0
 2. ODB1$RAT      2-OXOISOVALERATE DEHYDROGENASE    441    10    14  5.87     0
 3. RADX$YEAST    DNA REPAIR PROTEIN RAD10 (GENE    210     7    14  5.87     0

**** 4 standard deviations above mean ****

 4. SKI$HUMAN     SKI ONCOGENE (GENE NAME: SKI).    728     8    13  4.89     0
 5. ODB1$HUMAN    2-OXOISOVALERATE DEHYDROGENASE    444    10    13  4.89     0
 6. ODB1$BOVIN    2-OXOISOVALERATE DEHYDROGENASE    455    10    13  4.89     0
 7. DRTS$LEIMA    DIHYDROFOLATE REDUCTASE (EC 1.    520     4    13  4.89     0
 8. H3$NEUCR      HISTONE H3.                       135     7    13  4.89     0
 9. PYC$YEAST     PYRUVATE CARBOXYLASE (EC 6.4.1   1178     8    13  4.89     0
10. VP2$BTV13     VP2 PROTEIN (OUTER CAPSID PROT    959     7    13  4.89     0

**** 3 standard deviations above mean ****

11. PIP1$BOVIN    1-PHOSPHATIDYLINOSITOL-4,5-BIS   1216    10    12  3.91     0
12. TOXA$PSEAE    EXOTOXIN A PRECURSOR (EC 2.4.2    638    10    12  3.91     0
13. KPCE$RAT      PROTEIN KINASE C, EPSILON TYPE    737     5    12  3.91     0
14. KAD$MYCCA     ADENYLATE KINASE (EC 2.7.4.3)     213     5    12  3.91     0
15. MOD5$YEAST    TRNA ISOPENTENYL TRANSFERASE (    427     5    12  3.91     0
16. ODO2$ECOLI    DIHYDROLIPOAMIDE SUCCINYLTRANS    405     5    12  3.91     0
17. CD12$MOUSE    CD1.2 SURFACE ANTIGEN PRECURSO    297     9    12  3.91     0
18. HPRT$SCHMA    HYPOXANTHINE-GUANINE PHOSPHORI    284     5    12  3.91     0
19. CPAX$HUMAN    CYTOCHROME P450 IIA (EC 1.14.1    489     5    12  3.91     0
20. TRA4$ECOLI    TRANSPOSASE (TRANSPOSON TN2501    994     5    12  3.91     0



1. GUEST-346
ODP2$AZOVI   DIHYDROLIPOAMIDE ACETYLTRANSFERASE COMPONENT (E2)

ID   ODP2$AZOVI     STANDARD;      PRT;   638 AA.
AC   P10802;
DT   01-JUL-1989 (REL. 11, CREATED)
DT   01-JUL-1989 (REL. 11, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   DIHYDROLIPOAMIDE ACETYLTRANSFERASE COMPONENT (E2) OF PYRUVATE
DE   DEHYDROGENASE COMPLEX (EC 2.3.1.12).
OS   AZOTOBACTER VINELANDII.
OC   PROKARYOTA; BACTERIA; GRAM-NEGATIVE AEROBIC RODS AND COCCI;
OC   AZOTOBACTERIACEAE.
RN   [1] (STRAIN ATCC478, SEQUENCE FROM N.A.)
RA   HANEMAAIJER R., JANSSEN A., DE KOK A., VEEGER C.;
RL   EUR. J. BIOCHEM. 174:593-599(1988).
RN   [2] (LIPOYL DOMAIN CONFORMATION)
RA   HANEMAAIJER R., VERVOORT J., WESTPHAL A.H., DE KOK A., VEEGER C.;
RL   FEBS LETT. 240:205-210(1988).
CC   -!- FUNCTION: THE PYRUVATE DEHYDROGENASE COMPLEX CATALYZES THE OVERALL
CC       CONVERSION OF PYRUVATE TO ACETYL-COA & CO(2). IT CONTAINS MULTIPLE
CC       COPIES OF THREE ENZYMATIC COMPONENTS: PYRUVATE DEHYDROGENASE (E1),
CC       DIHYDROLIPOAMIDE ACETYLTRANSFERASE (E2) & LIPOAMIDE DEHYDROGENASE
CC       (E3).
CC   -!- CATALYTIC ACTIVITY: ACETYL-COA + DIHYDROLIPOAMIDE = COA +
CC       S-ACETYLDIHYDROLIPOAMIDE.
CC   -!- SUBUNIT: FORMS A 24-POLYPEPTIDE STRUCTURAL CORE WITH OCTAHEDRAL
CC       SYMMETRY.
CC   -!- COFACTOR: THE E2 COMPONENT CONTAINS THREE COVALENTLY-BOUND LIPOYL
CC       COFACTORS.
CC   -!- THERE ARE THREE COPIES OF THE LIPOYL BINDING DOMAIN.
DR   EMBL; X12455; AVDHLAAT.
KW   GLYCOLYSIS; TRANSFERASE; ACYLTRANSFERASE; DUPLICATION.
FT   DOMAIN        1    327       LIPOYL BINDING.
FT   DOMAIN      38[illegible]    CATALYTIC.
FT   BINDING      40     40       LIPOYL.
FT   BINDING     157    157       LIPOYL.
FT   BINDING     262    262       LIPOYL.
FT   REPEAT        1    116       (PUTATIVE).
FT   REPEAT      117    221       (PUTATIVE).
FT   REPEAT      222    327       (PUTATIVE).
SQ   SEQUENCE   638 AA;  65044 MW;  1923634 CN;

Initial Score    =  9   Optimized Score = 14   Significance = 5.87
Residue Identity = 30%  Matches         = 15   Mismatches   =   23
Gaps             = 12   Conservative Substitutions     =    0

X 10 20 30 X
APMAEGGQKPHEVVKFMDVYQRSFXRPIETLV--XIXQEYP

AAAAAASPAPAPLAPAAAGPQE--VKVPDIGSAGKARVIEVLVKAGDQVQAEQSLIVLESDKASMEIPSPA
210 X 220 230 240 250 260 270



2. GUEST-346
ODB1$RAT     2-OXOISOVALERATE DEHYDROGENASE PRECURSOR (EC 1.2.4

ID   ODB1$RAT       STANDARD;      PRT;   441 AA.
AC   P11960;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   2-OXOISOVALERATE DEHYDROGENASE PRECURSOR (EC 1.2.4.4) (BRANCHED-CHAIN
DE   ALPHA-KETO ACID DEHYDROGENASE COMPONENT (E1)) (FRAGMENT).
OS   RAT (RATTUS NORVEGICUS).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; RODENTIA.
RN   [1] (SEQUENCE FROM N.A.)
RA   ZHANG B., KUNTZ M.J., GOODWIN G.W., HARRIS R.A., CRABB D.W.;
RL   J. BIOL. CHEM. 262:15220-15224(1987).
CC   -!- FUNCTION: THE BRANCHED-CHAIN ALPHA-KETO DEHYDROGENASE COMPLEX
CC       CATALYZES THE OVERALL CONVERSION OF ALPHA-KETO ACIDS TO ACYL-COA
CC       AND CO(2). IT CONTAINS MULTIPLE COPIES OF 3 ENZYMATIC COMPONENTS:
CC       BRANCHED-CHAIN ALPHA-KETO ACID DECARBOXYLASE (E1), LIPOAMIDE
CC       ACYLTRANSFERASE (E2) AND LIPOAMIDE DEHYDROGENASE (E3).
CC   -!- CATALYTIC ACTIVITY: 3-METHYL-2-OXOBUTANEOATE + LIPOAMIDE =
CC       S-(2-METHYLPROPANOYL)DIHYDROLIPOAMIDE + CO(2).
CC   -!- SUBCELLULAR LOCATION: MITOCHONDRIAL MATRIX.
DR   EMBL; J02827; RNBCKDA.
KW   OXIDOREDUCTASE; FLAVOPROTEIN; THIAMINE PYROPHOSPHATE;
KW   MITOCHONDRION; TRANSIT PEPTIDE.
FT   NON_TER       1      1
FT   TRANSIT      <1     40       MITOCHONDRION.
FT   CHAIN        41    441       2-OXOISOVALERATE DEHYDROGENASE.
SQ   SEQUENCE   441 AA;  50164 MW;  928338 CN;

Initial Score    = 10   Optimized Score = 14   Significance = 5.87
Residue Identity = 33%  Matches         = 14   Mismatches   =   25
Gaps             =  3   Conservative Substitutions     =    0

            X       10        20           30        X
            APMAEGGQKPHEVVKFMDVYQ---RSFXRPIETLVXIXQEYP
               ||   ||     | ||||       |  | |    | |
  KQSRKKVMEAFEQAERKLKPNPSLLFSDVYQEMPAGLRRQQESLARHLQTYGEHYPLDHFDK
380       390       400       410       420       430       440
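
The "Residue Identity" figures printed with each alignment agree with the matches divided by the total number of aligned columns (matches plus mismatches plus gaps), truncated to a whole percent. This relation is inferred from the numbers in this printout rather than documented by the program, so the following is only a sketch; the function name `residue_identity` is hypothetical.

```python
def residue_identity(matches, mismatches, gaps):
    """Percent identity over all aligned columns, truncated to an integer.

    Assumption: the denominator counts gap columns as well as residue pairs.
    """
    columns = matches + mismatches + gaps
    if columns <= 0:
        raise ValueError("alignment has no columns")
    return (100 * matches) // columns


# Figures from the ODB1$RAT alignment above: 14 matches, 25 mismatches, 3 gaps.
print(residue_identity(14, 25, 3))
```

For the ODB1$RAT alignment this gives 33, matching the printed value; the same rule reproduces the other alignments in this report, for example 10 matches, 10 mismatches, and 14 gaps giving 29 for B3LP$HUMAN.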



3. GUEST-346
RADX$YEAST   DNA REPAIR PROTEIN RAD10 (GENE NAME: RAD10).

ID   RADX$YEAST     STANDARD;      PRT;   210 AA.
AC   P06838;
DT   01-JAN-1988 (REL. 06, CREATED)
DT   01-JAN-1988 (REL. 06, LAST SEQUENCE UPDATE)
DT   01-AUG-1988 (REL. 08, LAST ANNOTATION UPDATE)
DE   DNA REPAIR PROTEIN RAD10 (GENE NAME: RAD10).
OS   BAKER'S YEAST (SACCHAROMYCES CEREVISIAE).
OC   EUKARYOTA; FUNGI; ASCOMYCETES; HEMIASCOMYCETES.
RN   [1] (SEQUENCE FROM N.A.)
RA   WEISS W.A., FRIEDBERG E.C.;
RL   EMBO J. 4:1575-1582(1985).
RN   [2] (CORRECTIONS)
RA   WEISS W.A., FRIEDBERG E.C.;
RL   EMBO J. 4:3907-3907(1985).
RN   [3] (SEQUENCE FROM N.A.)
RA   REYNOLDS P., PRAKASH L., DUMAIS D., PEROZZI G., PRAKASH S.;
RL   EMBO J. 4:3549-3552(1985).
CC   -!- FUNCTION: THIS PROTEIN IS ONE OF 10 PROTEINS (RAD1,2,3,4,7,10,14,
CC       16,23 & MMS19) INVOLVED IN EXCISION REPAIR OF DNA DAMAGED WITH UV
CC       LIGHT, BULKY ADDUCTS, OR CROSS-LINKING AGENTS. OF THESE, THE RAD1,
CC       2,3,4,10, AND MMS19 PROTEINS SEEM TO BE REQUIRED FOR INCISION OF
CC       DAMAGED DNA.
CC   -!- SUBCELLULAR LOCATION: NUCLEAR.
CC   -!- SIMILARITY: SOME WITH MAMMALIAN ERCC-1.
DR   EMBL; X02591; SCRAD10.
DR   EMBL; X05225; SCRAD10G.
KW   DNA REPAIR; DNA-BINDING; NUCLEAR PROTEIN.
FT   DOMAIN       17     23       NUCLEAR LOCATION SIGNAL (PUTATIVE).
FT   DNA_BIND    133    153       PUTATIVE.
SQ   SEQUENCE   210 AA;  24311 MW;  228068 CN;

Initial Score    =  7   Optimized Score = 14   Significance = 5.87
Residue Identity = 36%  Matches         = 17   Mismatches   =   22
Gaps             =  8   Conservative Substitutions     =    0

            X       10          20            30          X
            APMAEGGQKPHE--VVKFMDVYQR----SFXRPIET-LVXIXQ-EYP
               |   ||| |    |  | | |       ||  | ||   | | |
  QTSRRINSNQVINAFNQQKPEEWTDSKATDDYNRKRPFRSTRPGKTVLVNTTQKENPLLNHLKSTNW
  50 X 60 70 80 90 100 X 110



4. GUEST-346
SKI$HUMAN    SKI ONCOGENE (GENE NAME: SKI).

ID   SKI$HUMAN      PRELIMINARY;   PRT;   728 AA.
AC   P12755;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   SKI ONCOGENE (GENE NAME: SKI).
OS   HUMAN (HOMO SAPIENS).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; PRIMATES.
RN   [1] (SEQUENCE FROM N.A.)
RA   NOMURA N., SASAMOTO S., ISHII S., MATSUI M., ISHIZAKI R.;
RL   NUCLEIC ACIDS RES. 17:5489-5489(1989).
CC   -!- SIMILARITY: TO SNO ONCOGENE.
DR   EMBL; X15218; X15218.
KW   ONCOGENE.
SQ   SEQUENCE   728 AA;  80004 MW;  2444050 CN;

Initial Score    =  8   Optimized Score = 13   Significance = 4.89
Residue Identity = 27%  Matches         = 16   Mismatches   =   22
Gaps             = 20   Conservative Substitutions     =    0

X 10 20 30 X
APMAEGG QKPHEVVKFMDVYQ RSFXRPIETLVXIXQEYP

SGLEAELEHLRGALEGGLDTKEAKEKFLHEVVK-MRVKQEEKLSAALQAKRSLHQELEFLRVAKKEKLREAT
540 X 550 560 570 580 590 600 X

EAKRNL
610


5. GUEST-346
ODB1$HUMAN   2-OXOISOVALERATE DEHYDROGENASE PRECURSOR (EC 1.2.4

ID   ODB1$HUMAN     STANDARD;      PRT;   444 AA.
AC   P12694;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   2-OXOISOVALERATE DEHYDROGENASE PRECURSOR (EC 1.2.4.4) (BRANCHED-CHAIN
DE   ALPHA-KETO ACID DEHYDROGENASE COMPONENT (E1)) (FRAGMENT).
OS   HUMAN (HOMO SAPIENS).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; PRIMATES.
RN   [1] (SEQUENCE FROM N.A.)
RA   FISHER C.W., CHUANG J.L., GRIFFIN T.A., LAU K.S., COX R.P.,
RA   CHUANG D.T.;
RL   J. BIOL. CHEM. 264:3448-3453(1988).
RN   [2] (LIVER, SEQUENCE OF 67-444 FROM N.A.)
RA   ZHANG B., CRABB D.W., HARRIS R.A.;
RL   GENE 69:159-164(1988).
RN   [3] (MAPLE SYRUP DISEASE MUTATION)
RA   ZHANG B., EDENBERG H.J., CRABB D.W., HARRIS R.A.;
RL   J. CLIN. INVEST. 83:1425-1429(1989).
CC   -!- FUNCTION: THE BRANCHED-CHAIN ALPHA-KETO DEHYDROGENASE COMPLEX
CC       CATALYZES THE OVERALL CONVERSION OF ALPHA-KETO ACIDS TO ACYL-COA
CC       AND CO(2). IT CONTAINS MULTIPLE COPIES OF 3 ENZYMATIC COMPONENTS:
CC       BRANCHED-CHAIN ALPHA-KETO ACID DECARBOXYLASE (E1), LIPOAMIDE
CC       ACYLTRANSFERASE (E2) AND LIPOAMIDE DEHYDROGENASE (E3).
CC   -!- CATALYTIC ACTIVITY: 3-METHYL-2-OXOBUTANEOATE + LIPOAMIDE =
CC       S-(2-METHYLPROPANOYL)DIHYDROLIPOAMIDE + CO(2).
CC   -!- SUBCELLULAR LOCATION: MITOCHONDRIAL MATRIX.
CC   -!- DISEASE: MAPLE SYRUP URINE DISEASE IS CAUSED BY A MUTATION AT
CC       POSITION 433.
DR   EMBL; J04474; HSKADA.
DR   EMBL; M22221; HSBCKDH.
KW   OXIDOREDUCTASE; FLAVOPROTEIN; THIAMINE PYROPHOSPHATE;
KW   MITOCHONDRION; TRANSIT PEPTIDE.
FT   NON_TER       1      1
FT   TRANSIT      <1     45       MITOCHONDRION.
FT   CHAIN        46    444       2-OXOISOVALERATE DEHYDROGENASE.
FT   VARIANT     433    433       N -> Y (IN MAPLE SYRUP URINE DISEASE).
FT   CONFLICT    247    247       A -> D (IN REF. 2).
SQ   SEQUENCE   444 AA;  50218 MW;  936218 CN;

Initial Score    = 10   Optimized Score = 13   Significance = 4.89
Residue Identity = 30%  Matches         = 14   Mismatches   =   25
Gaps             =  7   Conservative Substitutions     =    0

            X       10        20           30            X
            APMAEGGQKPHEVVKFMDVYQ---RSFXRPIETLVXIXQ----EYP
               ||   ||     | ||||          | |    |     ||
  KQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAGLRKQQESLARHLQTNGEHYPLDHFDK
       390  X    400       410       420       430       440



6. GUEST-346
ODB1$BOVIN   2-OXOISOVALERATE DEHYDROGENASE PRECURSOR (EC 1.2.4

ID   ODB1$BOVIN     STANDARD;      PRT;   455 AA.
AC   P11178;
DT   01-JUL-1989 (REL. 11, CREATED)
DT   01-JUL-1989 (REL. 11, LAST SEQUENCE UPDATE)
DT   01-JUL-1989 (REL. 11, LAST ANNOTATION UPDATE)
DE   2-OXOISOVALERATE DEHYDROGENASE PRECURSOR (EC 1.2.4.4) (BRANCHED-CHAIN
DE   ALPHA-KETO ACID DEHYDROGENASE COMPONENT (E1)).
OS   BOVINE (BOS TAURUS).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; ARTIODACTYLA.
RN   [1] (SEQUENCE FROM N.A.)
RA   HU C.-W.C., LAU K.S., GRIFFIN T.A., CHUANG J.L., FISHER C.W.,
RA   COX R.P., CHUANG D.T.;
RL   J. BIOL. CHEM. 263:9007-9014(1988).
CC   -!- FUNCTION: THE BRANCHED-CHAIN ALPHA-KETO DEHYDROGENASE COMPLEX
CC       CATALYZES THE OVERALL CONVERSION OF ALPHA-KETO ACIDS TO ACYL-COA
CC       AND CO(2). IT CONTAINS MULTIPLE COPIES OF 3 ENZYMATIC COMPONENTS:
CC       BRANCHED-CHAIN ALPHA-KETO ACID DECARBOXYLASE (E1), LIPOAMIDE
CC       ACYLTRANSFERASE (E2) AND LIPOAMIDE DEHYDROGENASE (E3).
CC   -!- CATALYTIC ACTIVITY: 3-METHYL-2-OXOBUTANEOATE + LIPOAMIDE =
CC       S-(2-METHYLPROPANOYL)DIHYDROLIPOAMIDE + CO(2).
CC   -!- SUBCELLULAR LOCATION: MITOCHONDRIAL MATRIX.
DR   EMBL; J03759; BTKAD.
KW   OXIDOREDUCTASE; FLAVOPROTEIN; THIAMINE PYROPHOSPHATE;
KW   MITOCHONDRION; TRANSIT PEPTIDE.
FT   TRANSIT       1     55       MITOCHONDRION.
FT   CHAIN        56    455       2-OXOISOVALERATE DEHYDROGENASE.
SQ   SEQUENCE   455 AA;  51678 MW;  991502 CN;

Initial Score    = 10   Optimized Score = 13   Significance = 4.89
Residue Identity = 30%  Matches         = 13   Mismatches   =   26
Gaps             =  3   Conservative Substitutions     =    0

            X       10        20           30        X
            APMAEGGQKPHEVVKFMDVYQ---RSFXRPIETLVXIXQEYP
               ||   ||     | ||||          | |    | |
  KQSRKKVMEAFEQAERKLKPNPSLIFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFEK
      400   X   410       420       430       440    X  450



7. GUEST-346
DRTS$LEIMA   DIHYDROFOLATE REDUCTASE (EC 1.5.1.3) / THYMIDYLATE

ID   DRTS$LEIMA     STANDARD;      PRT;   520 AA.
AC   P07382;
DT   01-APR-1988 (REL. 07, CREATED)
DT   01-APR-1988 (REL. 07, LAST SEQUENCE UPDATE)
DT   01-MAR-1989 (REL. 10, LAST ANNOTATION UPDATE)
DE   DIHYDROFOLATE REDUCTASE (EC 1.5.1.3) / THYMIDYLATE SYNTHASE
DE   (EC 2.1.1.45).
OS   LEISHMANIA MAJOR.
OC   EUKARYOTA; PROTOZOA; SARCOMASTIGOPHORA.
RN   [1] (SEQUENCE FROM N.A.)
RA   BEVERLEY S.M., ELLENBERGER T.E., CORDINGLEY J.S.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 83:2584-2588(1986).
CC   -!- CATALYTIC ACTIVITY: 5,6,7,8-TETRAHYDROFOLATE + NADP(+) =
CC       7,8-DIHYDROFOLATE + NADPH.
CC   -!- CATALYTIC ACTIVITY: 5,10-METHYLENETETRAHYDROFOLATE + DUMP =
CC       DIHYDROFOLATE + DTMP.
CC   -!- PATHWAY: ESSENTIAL STEP FOR DE NOVO GLYCINE AND PURINE SYNTHESIS,
CC       DNA PRECURSOR SYNTHESIS, AND FOR THE CONVERSION OF DUMP TO DTMP.
DR   PIR; A23403; RDLNTS.
DR   EMBL; M12734; LMDHFRTS.
KW   [illegible]
KW   METHYLTRANSFERASE; [illegible] BIOSYNTHESIS; ONE-CARBON METABOLISM.
FT   DOMAIN      [illegible]      DIHYDROFOLATE REDUCTASE.
FT   DOMAIN      234    520       THYMIDYLATE SYNTHASE.
FT   ACT_SITE    400    400       BY HOMOLOGY.
SQ   SEQUENCE   520 AA;  58688 MW;  1352514 CN;

Initial Score    =  4   Optimized Score = 13   Significance = 4.89
Residue Identity = 29%  Matches         = 15   Mismatches   =   24
Gaps             = 12   Conservative Substitutions     =    0

X 10 20 30 X
APMAEG GQKPHEVVKFMDVYQRSFXRPIET--LVXIXQEYP

SSKATVEELLAPLPEGQRAAAAQDVVVVNGGLAEALRLLARPLYCSSIETAYCVGGAQVYADAMLSPCIEK
110 X 120 130 140 150 160 X 170



8. GUEST-346
H3$NEUCR     HISTONE H3.

ID   H3$NEUCR       STANDARD;      PRT;   135 AA.
AC   P07041;
DT   01-APR-1988 (REL. 07, CREATED)
DT   01-APR-1988 (REL. 07, LAST SEQUENCE UPDATE)
DT   01-APR-1988 (REL. 07, LAST ANNOTATION UPDATE)
DE   HISTONE H3.
OS   NEUROSPORA CRASSA.
OC   EUKARYOTA; FUNGI; ASCOMYCETES; PYRENOMYCETES.
RN   [1] (SEQUENCE FROM N.A.)
RA   WOUDT L.P., PASTINK A., KEMPERS-VEENSTRA A., JANSEN A.E.M.,
RA   MAGER W.H., PLANTA R.J.;
RL   NUCLEIC ACIDS RES. 11:5347-5360(1983).
DR   EMBL; X01612; NCHISH3.
KW   CHROMOSOMAL PROTEIN; NUCLEOSOME CORE.
FT   INIT_MET      0      0
SQ   SEQUENCE   135 AA;  15303 MW;  85124 CN;

Initial Score    =  7   Optimized Score = 13   Significance = 4.89
Residue Identity = 32%  Matches         = 17   Mismatches   =   21
Gaps             = 14   Conservative Substitutions     =    0

X 10 20 30 X
APMAEGGQKPHE VVKFMDV YQRS FXR PIETLV-XIXQEYP

QLASKAARKSAPSTGGVKKPH--RYKPGTVALREIRRYQKSTELLIRKLPFQRLVREIAQDFK SDLRFQSSAI
20 30 40 50 60 70 80



9. GUEST-346
PYC$YEAST    PYRUVATE CARBOXYLASE (EC 6.4.1.1) (PYRUVIC CARBOXY

ID   PYC$YEAST      STANDARD;      PRT;  1178 AA.
AC   P11154;
DT   01-JUL-1989 (REL. 11, CREATED)
DT   01-JUL-1989 (REL. 11, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   PYRUVATE CARBOXYLASE (EC 6.4.1.1) (PYRUVIC CARBOXYLASE) (PCB) (GENE
DE   NAME: PYV).
OS   BAKER'S YEAST (SACCHAROMYCES CEREVISIAE).
OC   EUKARYOTA; FUNGI; ASCOMYCETES; HEMIASCOMYCETES.
RN   [1] (SEQUENCE FROM N.A., AND PARTIAL SEQUENCE)
RA   LIM F., MORRIS C.P., OCCHIODORO F., WALLACE J.C.;
RL   J. BIOL. CHEM. 263:11493-11497(1988).
RN   [2] (SEQUENCE OF 1003-1178 FROM N.A.)
RA   MORRIS C.P., LIM F., WALLACE J.C.;
CC   -!- FUNCTION: PYRUVATE CARBOXYLASE CATALYZES A 2-STEP REACTION,
CC       INVOLVING THE ATP-DEPENDENT CARBOXYLATION OF THE COVALENTLY
CC       ATTACHED BIOTIN IN THE FIRST STEP AND THE TRANSFER OF THE
CC       CARBOXYL GROUP TO PYRUVATE IN THE SECOND.
CC   -!- CATALYTIC ACTIVITY: ATP + PYRUVATE + HCO(3)(-) = ADP +
CC       ORTHOPHOSPHATE + OXALOACETATE.
CC   -!- PATHWAY: GLUCONEOGENESIS.
CC   -!- SUBUNIT: TETRAMER.
CC   -!- COFACTOR: BIOTIN, AND ZINC.
CC   -!- SIMILARITY: WITH OTHER BIOTIN CARBOXYLASES, LIPOAMIDE
CC       TRANSFERASES AND CARBAMYL PHOSPHATE SYNTHETASES.
DR   EMBL; J03889; SCPCB.
KW   LIGASE; MULTIFUNCTIONAL ENZYME; BIOTIN; GLUCONEOGENESIS; ZINC.
FT   BINDING    1135   1135       BIOTIN (BY SIMILARITY).
FT   SIMILAR     160    330       CARBAMOYL PHOSPHATE SYNTHETASES.
FT   SIMILAR     350    470       WITH OTHER BIOTIN CARBOXYLASES.
FT   SIMILAR    1086   1178       WITH OTHER BIOTIN CARRIER PROTEINS AND
FT                                WITH LIPOAMIDE ACETYLTRANSFERASE.
SQ   SEQUENCE  1178 AA;  130098 MW;  7059028 CN;

Initial Score    =  8   Optimized Score = 13   Significance = 4.89
Residue Identity = 31%  Matches         = 14   Mismatches   =   25
Gaps             =  5   Conservative Substitutions     =    0

X 10 20 30 X
APMAEGGQKPHEVVKFMDV--YQRSFXRPIET LVXIXQEYP

ECDVASYNMYPRVYEDFGKMRETYGDLSVLPTRSFLSPLETDEEIEVVIEQGKTLIIKLQAVGD
1000 X 1010 1020 1030 1040 1050



10. GUEST-346
VP2$BTV13    VP2 PROTEIN (OUTER CAPSID PROTEIN VP2) (GENE NAME:

ID   VP2$BTV13      STANDARD;      PRT;   959 AA.
AC   P12395;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-OCT-1989 (REL. 12, LAST ANNOTATION UPDATE)
DE   VP2 PROTEIN (OUTER CAPSID PROTEIN VP2) (GENE NAME: L2).
OS   BLUETONGUE VIRUS (SEROTYPE 13).
OC   VIRIDAE; DS-RNA NONENVELOPED VIRUSES; REOVIRIDAE.
RN   [1] (SEQUENCE FROM N.A.)
RA   FUKUSHO A., RITTER G.D., ROY P.;
RL   J. GEN. VIROL. 68:2967-2973(1987).
CC   -!- FUNCTION: THE VP2 PROTEIN IS ONE OF THE TWO PROTEINS (WITH VP5)
CC       WHICH CONSTITUTE THE VIRUS PARTICLE OUTER CAPSID. IT IS THE
CC       MAJOR TARGET OF THE HOST IMMUNOGENIC RESPONSE.
DR   PIR; A27495; P2XR13.
KW   COAT PROTEIN.
SQ   SEQUENCE   959 AA;  112563 MW;  4839211 CN;

Initial Score    =  7   Optimized Score = 13   Significance = 4.89
Residue Identity = 29%  Matches         = 14   Mismatches   =   25
Gaps             =  9   Conservative Substitutions     =    0

X 10 20 30 X
APMAEGGQKPHEV VKFMD VYQRSFXRPIETLVXIXQEYP

FPPYFDQWTYVPMFNARIKPCEVEVGERKNIDPYVKRTHRPLKADCIELMRYHMSQYMDLRVSLQGTS
520 X 530 540 550 560 570



Results file guest-346.res made by alexk on Thu 26 Apr 90 9:20:03-PDT.

Query sequence being compared: GUEST-346
Number of sequences searched: 12476
Number of scores above cutoff: 3893

Results of the initial comparison of GUEST-346 with:
Data bank : PIR 21.0, all entries

[Histogram: NUMBER OF SEQUENCES (axis ticks 10, 100, 500, 1000, 5000, 10000) plotted against SCORE (0 to 10), with a STDEV scale (-1 to 4) beneath; plot points not recoverable.]



PARAMETERS

Similarity matrix        Unitary     K-tuple                   2
Mismatch penalty               1     Joining penalty          20
Gap penalty                 1.00     Window size              32
Gap size penalty            0.05
Cutoff score                   0
Randomization group            0

Initial scores to save        20     Alignments to save       10
Optimized scores to save      20     Display context          10



SEARCH STATISTICS

Scores:     Mean     Median     Standard Deviation
               3          4                   1.39

Times:      CPU              Total Elapsed
            00:01:12.02      00:04:55.00



Number of residues: 3406022
Number of sequences searched: 12476
Number of scores above cutoff: 3893

Cut-off raised to 2.
Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.

The scores below are sorted by initial score.
Significance is calculated based on initial score.

A 100% identical sequence to the query sequence was not found.

The list of best scores is:



                                                        Init.  Opt.
    Sequence Name Description                    Length Score Score  Sig. Frame

**** 5 standard deviations above mean ****

 1. SHNC          Phosphoribosyl-AMP cyclohydrol    863    10    12  5.03     0
 2. NBMSH         Complement factor H precursor    1234    10    11  5.03     0
 3. PFHUG1        Platelet-derived growth factor    211    10    10  5.03     0
 4. E29504        Mercuric reductase - Staphyloc    547    10    11  5.03     0
 5. B28964        Platelet-derived growth factor    196    10    10  5.03     0
 6. A28073        2-Oxoisovalerate dehydrogenase    455    10    13  5.03     0
 7. A28964        Platelet-derived growth factor    211    10    10  5.03     0
 8. A29468        2-Oxoisovalerate dehydrogenase    441    10    14  5.03     0

**** 4 standard deviations above mean ****

 9. XUECDP        UDP-acetylglucosamine acyltran    115     9    10  4.32     0
10. SHBY          Phosphoribosyl-AMP cyclohydrol    799     9    10  4.32     0
11. DTECC         Aspartate carbamoyltransferase    311     9    10  4.32     0
12. KFHU1         Coagulation factor XI precurso    625     9     9  4.32     0
13. OQHU          Hemopexin precursor - Human (f    441     9    10  4.32     0
14. XMECDD        dedD protein - Escherichia col    211     9    11  4.32     0
15. W2WLDP        Probable E2 protein - Deer pap    416     9    10  4.32     0
16. P5XR10        Outer capsid protein VP5 - Blu    526     9     9  4.32     0



if. ^Bro^ uene pimein — dch^c^i iupnag 

18. A23162 EK^fAtfairab^tff <+ragment> 

19. JT0315 Parasporal crystal protein - B 

20. S00049 Aspartate carbamoyl transferase 

The scores below are sorted by opt i mi zed score. 
Significance is calculated based on optimized score. 

A 100% identical sequence to the query sequence was not -found. 





a 


1 






u 


154 


9 


IO 


4. 


32 


O 


1 135 


9 


1 1 


4. 


32 


O 


31 1 


9 


10 


4. 


32 


0 



The list of best scores is:

                                                      Init.   Opt.
    Sequence Name  Description                 Length Score  Score   Sig.  Frame

              **** 5 standard deviations above mean ****
 1. A29468    2-Oxoisovalerate dehydrogenase     441     10     14   5.96     0
 2. XYECCR    Chemotaxis protein methylase -     286      4     14   5.96     0
 3. B22726    RAD10 protein - Yeast (Sacchar     195      7     14   5.96     0
 4. A24576    RAD10 protein - Yeast (Sacchar     210      7     14   5.96     0
              **** 4 standard deviations above mean ****
 5. VVVPC2    Coat proteins VP2 and VP3 - Mo     319      4     13   4.97     0
 6. A28073    2-Oxoisovalerate dehydrogenase     455     10     13   4.97     0
 7. VVVP2     Coat proteins VP2 and VP3 - Mo     319      4     13   4.97     0
 8. RDLNTS    Dihydrofolate reductase/thymid     520      4     13   4.97     0
 9. A29233    Pyruvate carboxylase - Yeast (    1178      8     13   4.97     0
10. A05032    Hypothetical protein 548 (homo     548      7     13   4.97     0
11. P2XR13    VP2 protein - Bluetongue virus     959      7     13   4.97     0
12. B28814    Ig heavy chain V region - Chic     116      7     13   4.97     0
13. A28912    Kinase-related protein sevenle    2554      5     13   4.97     0
              **** 3 standard deviations above mean ****
14. S00373    Histone H3 - Wheat                 135      7     12   3.97     0
15. GNWVY     Genome polyprotein - Yellow fe    3411      7     12   3.97     0
16. A25564    Histone H3 - Rice                  136      7     12   3.97     0
17. A26014    Histone H3 - Wheat                 136      7     12   3.97     0
18. HVMS3     Ig heavy chain precursor V reg     117      7     12   3.97     0
19. A05129    Cholera enterotoxin, A chain p     258      7     12   3.97     0
20. A27126    Multidrug resistance protein 1     572      7     12   3.97     0



1. GUEST-346
   A29468     2-Oxoisovalerate dehydrogenase (lipoamide), E1-alp

ENTRY           A29468   #Type Protein (fragment)
TITLE           2-Oxoisovalerate dehydrogenase (lipoamide), E1-alpha
                chain precursor - Rat (fragment)   #EC-number 1.2.4.4
ALTERNATE-NAME  branched-chain alpha-keto acid dehydrogenase
SOURCE          Rattus norvegicus   #Common-name Norway rat
ACCESSION       A29468
REFERENCE       (Sequence translated from the mRNA sequence)
   #Authors     Zhang B., Kuntz M. J., Goodwin G. W., Harris R. A.,
                Crabb D. W.
   #Journal     J. Biol. Chem. (1987) 262:15220-15224
   #Title       Molecular cloning of a cDNA for the E1-alpha subunit
                of rat liver branched chain alpha-ketoacid
                dehydrogenase.
SUMMARY         #Length 441   #Checksum 8882
SEQUENCE

Initial Score    = 10    Optimized Score = 14    Significance = 5.96
Residue Identity = 33%   Matches         = 14    Mismatches   = 25
Gaps             = 3     Conservative Substitutions           = 0

10 20 30



t-tM'iHtiauatorNi-rttv vrvrnuv rta ixor- Ar-^t- a. 1 i LVAiAutrr 




KQSRKKVMEAFEQAERKLKPNPSLLFSDvYqEMPAQLRRQQESLARHLQTYGEHYPLDHFDK 



380 390 400 410 420 430 440



2. GUEST-346
   XYECCR     Chemotaxis protein methylase - Escherichia coli

ENTRY           XYECCR   #Type Protein
TITLE           Chemotaxis protein methylase - Escherichia coli
                #EC-number 2.1.1.-
DATE            28-Dec-1987   #Sequence 28-Dec-1987   #Text 28-Dec-1987
PLACEMENT       127.0   1.0   1.0   1.0   1.0
SOURCE          Escherichia coli
ACCESSION       C25195
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Mutoh N., Simon M. I.
   #Journal     J. Bacteriol. (1986) 165:161-166
   #Title       Nucleotide sequence corresponding to five chemotaxis
                genes in Escherichia coli.
COMMENT         This protein appears to be a methyl esterase
                specifically responsible for removing the methyl
                group from the gamma-glutamyl methyl ester
                residues in the methyl-accepting chemotaxis
                proteins (MCP). The MCP methylation state of the
                cell is crucial for sensory responses and
                adaptations.
GENETIC
   #Map-position 42
   #Name        cheR
SUPERFAMILY     #Name chemotaxis protein methylase
KEYWORDS        methyltransferase\ chemotaxis response
SUMMARY         #Molecular-weight 32750   #Length 286   #Checksum 10
SEQUENCE

Initial Score    = 4     Optimized Score = 14    Significance = 5.96
Residue Identity = 34%   Matches         = 15    Mismatches   = 24
Gaps             = 5     Conservative Substitutions           = 0



X 10 20 30 X 

APMAEGGGKPHEWK — FMDVYQRSFXR P I ETLVX I XOEYP 



VFASD I DTEVLEK ARSG I YRHEELKNLTPQQLGRYFMRGTGPHEGLVRVRQELANY VDFAPLNL 
150 160 170 180 190 200 X 210



3. GUEST-346
   B22726     RAD10 protein - Yeast (Saccharomyces cerevisiae)

ENTRY           B22726   #Type Protein
TITLE           RAD10 protein - Yeast (Saccharomyces cerevisiae)
SOURCE          Saccharomyces cerevisiae
ACCESSION       B22726
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Weiss W. A., Friedberg E. C.
   #Journal     EMBO J. (1985) 4:1575-1582
   #Title       Molecular cloning and characterization of the yeast
                RAD10 gene and expression of RAD10 protein in E.
                coli.
GENETIC
   #Map-position 8R
   #Name        RAD10
SUMMARY         #Molecular-weight 22614   #Length 195   #Checksum 1203
SEQUENCE

Initial Score    = 7     Optimized Score = 14    Significance = 5.96
Residue Identity = 36%   Matches         = 17    Mismatches   = 22
Gaps             = 8     Conservative Substitutions           = 0

X 10 20 30 X 

APMAEGGGKPHE WKFMDVYQR SFXRP I ET-LVX I XO-EYP 

I III! I III II III III 

1 ill! I III II III lit 

OTSRR I NSNQV I NAFNQOKPEEWTDSK ATDDYNRKRPFRSTRPGKTVLVNTTGKENPLLNHLKSTNW 
50 X 60 70 80 90 100 X 110



4. GUEST-346
   A24576     RAD10 protein - Yeast (Saccharomyces cerevisiae)

ENTRY           A24576   #Type Protein
TITLE           RAD10 protein - Yeast (Saccharomyces cerevisiae)
SOURCE          Saccharomyces cerevisiae
ACCESSION       A24576
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Reynolds P., Prakash L., Dumais D., Perozzi G.,
                Prakash S.
   #Journal     EMBO J. (1985) 4:3549-3552
GENETIC
   #Name        RAD10
SUPERFAMILY     #Name Gene RAD10 protein
KEYWORDS        UV
SUMMARY         #Molecular-weight 24311   #Length 210   #Checksum 5515
SEQUENCE

Initial Score    = 7     Optimized Score = 14    Significance = 5.96
Residue Identity = 36%   Matches         = 17    Mismatches   = 22
Gaps             = 8     Conservative Substitutions           = 0



X 10 
APMAEGGQKPHE- 



20 30 X 

-WKFMDVYQR SFXRP I ET-LVX I XQ-E YP 



QTSRR I NSNQV I NAFNQGKPEEWTDSK ATDDYNRKRPFRSTRPGKTVLVNTTQKENPLLNHLKSTNW 
50 X 60 70 80 90 100 X 110



5. GUEST-346
   VVVPC2     Coat proteins VP2 and VP3 - Mouse polyomavirus

ENTRY           VVVPC2   #Type Protein
TITLE           Coat proteins VP2 and VP3 - Mouse polyomavirus
                (strain Crawford small-plaque)
DATE            30-Jun-1989   #Sequence 30-Jun-1989   #Text 30-Jun-1989
PLACEMENT       1163.0   3.0   1.0   1.0   2.0
SOURCE          mouse polyomavirus
ACCESSION       E28838
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Rothwell V. M., Folk W. R.
   #Journal     J. Virol. (1983) 48:472-480
   #Title       Comparison of the DNA sequence of the Crawford
                small-plaque variant of polyomavirus with those of
                polyomaviruses A2 and strain 3.
   #Residues    1-319 <R01>
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Rothwell V. M.
   #Citation    submitted to GenBank, November 1985
   #Residues    1-319 <R02>
COMMENT         The DNA sequence was obtained from GenBank, release
                56.0.
COMMENT         This virus is a member of the family Papovaviridae.
SUPERFAMILY     #Name polyoma coat proteins VP2 and VP3
KEYWORDS        late protein
FEATURE
   1-319        #Protein coat protein VP2 <VP2>\
   116-313      #Protein coat protein VP3 <VP3>
SUMMARY         #Molecular-weight 34827   #Length 319   #Checksum 2781
SEQUENCE

Initial Score    = 4     Optimized Score = 13    Significance = 4.97
Residue Identity = 28%   Matches         = 14    Mismatches   = 25
Gaps             = 10    Conservative Substitutions           = 0



X 10
APMAEGGGKP- 



20 30 X 

-HE WKFMDV YQRSFXR — P I ETLVX I XQEYP 



QRRALFNR I EGSMGNGGPTPA AH I QDESGEV I KF YQAQ WSHQRVTPDWMLPL I LGLYGD I TPTWAT V I 
240 250 260 270 280 290 300 



6. GUEST-346
   A28073     2-Oxoisovalerate dehydrogenase (lipoamide), E1 alp

ENTRY           A28073   #Type Protein
TITLE           2-Oxoisovalerate dehydrogenase (lipoamide), E1 alpha
                chain precursor - Bovine   #EC-number 1.2.4.4
ALTERNATE-NAME  branched-chain alpha-keto acid dehydrogenase E1
                alpha chain\ branched-chain alpha-keto acid
                decarboxylase
SOURCE          Bos primigenius taurus   #Common-name cattle
ACCESSION       A28073
REFERENCE       (Sequence translated from the mRNA sequence)
   #Authors     Hu C. W. C., Lau K. S., Griffin T. A., Chuang J. L.,
                Fisher C. W., Cox R. P., Chuang D. T.
   #Journal     J. Biol. Chem. (1988) 263:9007-9014
   #Title       Isolation and sequencing of a cDNA encoding the
                decarboxylase (E1)-alpha precursor of bovine
                branched-chain alpha-keto acid dehydrogenase
                complex. Expression of E1-alpha mRNA and subunit
                in maple-syrup-urine-disease and 3T3-L1 cells.
FEATURE
   56-455       #Protein 2-oxoisovalerate dehydrogenase
                (lipoamide), E1 alpha chain (MAT)
SUMMARY         #Molecular-weight 51678   #Length 455   #Checksum 4630
SEQUENCE

Initial Score    = 10    Optimized Score = 13    Significance = 4.97
Residue Identity = 30%   Matches         = 13    Mismatches   = 26
Gaps             = 3     Conservative Substitutions           = 0



X 10 20 30 X 

APMAEGGQKPHE VVKFMDVYQ RSFXRP I ETLVX I XQEYP 

ii ii i i I ■ i II it 

ti ii i i I i i ii ii 

KQSRKKVMEAFEQAERKLKPNPSLIFSDVYQEMPA9LRKQQESLARHLQTYGEHYPLDHFEK 
400 X 410 420 430 440 X 450 



7. GUEST-346
   VVVP2      Coat proteins VP2 and VP3 - Mouse polyomavirus

ENTRY           VVVP2   #Type Protein
TITLE           Coat proteins VP2 and VP3 - Mouse polyomavirus
DATE            31-Jul-1980   #Sequence 08-Oct-1981   #Text 27-Nov-1985
PLACEMENT       1163.0   3.0   1.0   1.0   1.0
SOURCE          mouse polyomavirus
ACCESSION       A03635
COMMENT         The VP2 sequence of strain A2 is shown; VP3
                corresponds to residues 116-319.
REFERENCE       (Strain A2, sequence translated from the DNA
                sequence)
   #Authors     Walsh J. E., Griffin B. E.
   #Journal     Nature (1980) 283:445-453
REFERENCE       (Strain 3, sequence translated from the DNA
                sequence)
   #Authors     Friedmann T., Esty A., LaPorte P., Deininger P.
   #Journal     Cell (1979) 17:715-724
   #Comment     This sequence differs from that shown in having
                78-Asn, 219-Val, 276-Pro, 277-Gly, 278-Gly, and
                279-Ala.
SUMMARY         #Molecular-weight 34800   #Length 319   #Checksum 2886
SEQUENCE

Initial Score    = 4     Optimized Score = 13    Significance = 4.97
Residue Identity = 28%   Matches         = 14    Mismatches   = 25
Gaps             = 10    Conservative Substitutions           = 0



X 10 
APMAEGGQKP- 



20 30 X 

-HEWKFMDVYQRSFXR — P I ETLVX I XOEYP 



QRRALFNR I EGSMGNGGPTPA AH I GDESGEV I KF YQAQ WSHQR VTPDWMLPL I LGLYGD I TPTWATV I 
240 250 260 270 280 290 300 



8. GUEST-346
   RDLNTS     Dihydrofolate reductase/thymidylate synthase -

ENTRY           RDLNTS   #Type Protein
TITLE           Dihydrofolate reductase/thymidylate synthase -
                Leishmania tropica   #EC-number 1.5.1.3   #EC-number
                2.1.1.45
DATE            28-Dec-1987   #Sequence 28-Dec-1987   #Text 31-Mar-1988
PLACEMENT       128.0   2.0   1.0   1.0   1.0
SOURCE          Leishmania tropica major
ACCESSION       A23403
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Beverley S. M., Ellenberger T. E., Cordingley J. S.
   #Journal     Proc. Nat. Acad. Sci. USA (1986) 83:2584-2588
   #Title       Primary structure of the gene encoding the
                bifunctional dihydrofolate reductase-thymidylate
                synthase of Leishmania major.
SUPERFAMILY     1-520 #Name DHFR-TS bifunctional enzyme\
                1-233 #Name dihydrofolate reductase\
                234-520 #Name thymidylate synthase
KEYWORDS        bifunctional enzyme\ oxidoreductase\ synthase
SUMMARY         #Molecular-weight 58688   #Length 520   #Checksum 2419
SEQUENCE

Initial Score    = 4     Optimized Score = 13    Significance = 4.97
Residue Identity = 29%   Matches         = 15    Mismatches   = 24
Gaps             = 12    Conservative Substitutions           = 0



X 10 20 

APMAEG GGKPHEWKFMDVYQRSFXRP- 



30 X 
■IET — LVXIXQEYP 



SSK ATVEELLAPLPEGQRA AAAQDVVVVNGGL AEALRLLARPLYCSS I ETAYCVGGAQVYADAMLSPC I EK 
110 X 120 130 140 150 160 X 170



9. GUEST-346
   A29233     Pyruvate carboxylase - Yeast (Saccharomyces

ENTRY           A29233   #Type Protein
TITLE           Pyruvate carboxylase - Yeast (Saccharomyces
                cerevisiae)   #EC-number 6.4.1.1
ALTERNATE-NAME  pyruvic carboxylase
SOURCE          Saccharomyces cerevisiae
ACCESSION       A29233\ A29722
REFERENCE       (Sequence translated from the DNA sequence)
   #Authors     Lim F., Morris C. P., Occhiodoro F., Wallace J. C.
   #Journal     J. Biol. Chem. (1988) 263:11493-11497
   #Title       Sequence and domain structure of yeast pyruvate
                carboxylase.
SUMMARY         #Molecular-weight 130098   #Length 1178   #Checksum 6631
SEQUENCE

Initial Score    = 8     Optimized Score = 13    Significance = 4.97
Residue Identity = 31%   Matches         = 14    Mismatches   = 25
Gaps             = 5     Conservative Substitutions           = 0

X 10 20 30 X
APMAEGGQKPHE WKFMDV- YQRSFXRP I ET LVX I XQEYP 

i ii i i til ill ill 

I ii ■ ■ ill ill lit 

ECDVASYNMYPRVYEDFQKMRETYGDLSVLPTRSFLSPLETDEE I EW I EQGKTL I I KLQAVGD 
1000 X 1010 1020 1030 1040 1050

10. GUEST-346
    A05032    Hypothetical protein 548 (homolog of E. coli rpoC)

ENTRY           A05032   #Type Protein
TITLE           Hypothetical protein 548 (homolog of E. coli rpoC) -
                Common tobacco chloroplast
SOURCE          chloroplast Nicotiana tabacum   #Common-name common
                tobacco
ACCESSION       A05032
REFERENCE       (cv. Bright Yellow 4; sequence translated from the
                DNA sequence)
   #Authors     Sugiura M.
   #Citation    submitted to EMBL, August 1986, in computer-readable
                form
REFERENCE       (cv. Bright Yellow 4; gene organization, sites, and
                features)
   #Authors     Shinozaki K., Ohme M., Tanaka M., Wakasugi T.,
                Hayashida N., Matsubayashi T., Zaita N.,
                Chunwongse J., Obokata J., Yamaguchi-Shinozaki K.,
                Ohto C., Torazawa K., Meng B. Y., Sugita M., Deno
                H., Kamogashira T., Yamada K., Kusuda J., Takaiwa
                F., Kato A., Tohdoh N., Shimada H., Sugiura M.
   #Journal     EMBO J. (1986) 5:2043-2049
GENETIC
   #Start-codon AGG
COMMENT         The code is Q5NT48.
SUMMARY         #Molecular-weight 63034   #Length 548   #Checksum 2349
SEQUENCE

Initial Score    = 7     Optimized Score = 13    Significance = 4.97
Residue Identity = 35%   Matches         = 14    Mismatches   = 22
Gaps             = 4     Conservative Substitutions           = 0

X 10 20 30 X 

APMAEGGQKPHEWK— FMDVYQRSFXRP I ETLVX I XQEYP 

it i i i t i i i i ill t 

till i i i i i i lit i 

DTLLDNG I RGQPMRDGHNK VYKSFSDV I EGKEGRFRETLLGKRVDYSGRSV I WGPS 

200 210 220 230 240 X 250 



